x Contact us ask us for an offer
Thank you for your message. We will contact you as soon as we understand your needs!
  • Witaj Hello!
  • Masz pytania? Any questions?
  • Skontaktuj się contact us
    we can help you with your business!
Blog
  • 05
    Feb
    Author: matix February 5, 2018 15:38
    How to improve performance of e-commerce application using ElasticSearch

    Probably everyone heard of ElasticSearch. It's just a search database, using Zend Lucene engine for searching the most relevant results.
    If you are familar with Apache Solr or Sphinx - You will probably understand ElasticSearch aswell.

    Agenda
    1. What is Elastic Search for?
    2. Performance problem - key:value database schema, lot's of joins, etc,
    3. The solution of problem - ElasticSearch,



    What is Elastic Search?


    What is Elastic Search?
    Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases.
    It's just a search database, using Zend Lucene engine for searching the most relevant results.

    But ... this post goal is more about practice, so let's leave the theory.


    Key Value Database



    Our customer had a huge performacne problem. Whole application structure, such as categories, products (e-commerce application), orders.. are stored in single Table (Key:Value).
    The code fragment (problematic) of the app is shown below:

    /**
    * @ORM\Entity
    * @ORM\Table(name="TaxonomyItem")
    */
    class TaxonomyItem
    {
    /**
    * @ORM\Column(type="integer")
    * @ORM\Id
    * @ORM\GeneratedValue(strategy="AUTO")
    */
    protected $id;

    /**
    * @ORM\Column(type="string", length=100)
    * @Assert\NotBlank()
    */
    protected $title;

    /**
    * @Gedmo\Slug(fields={"title"}, updatable=false)
    * @ORM\Column(length=128, unique=true)
    */
    protected $slug;

    /**
    * @ORM\ManyToOne(targetEntity="TaxonomyCategory", inversedBy="taxonomyItems")
    * @ORM\JoinColumn(name="taxonomy_category_id", referencedColumnName="id")
    */
    protected $taxonomyCategory;

    /**
    * @ORM\OneToMany(targetEntity="TaxonomyField", mappedBy="taxonomyItem",cascade={"persist", "remove"})
    */
    protected $customFields;


    As we can see, every object in the database (ex. Product) has his title and id. Other elements (such as: description, photos) are joined from TaxonomyFields (Key:Value) table.
    The basic problem is the performance but... how about searching ? Yes, you can search by title... but what about searching for description? Big, slow query.



    Live example:

    We have 30 products on a page. Every product has fields: description, description_en. 31 queries are generated, because:
    1. We fetch 30 TaxonomyItems (products) - 1 query,
    2. For every single TaxonomyItem, we fetch the details from TaxonomyField table - 30 queries

    1 + 30 = 31 queries.

    When the real customers and suppliers registered on that platform, the performance problem occured.
    Some of the requests generated 3000 database queries! For single user.


    ElasticSearch was the solution...


    The first solution, that we were thinking about was... redesign the database for standard entities, such as:

    1. Product (id, title, description, photos)
    2. Order (id, userId, List products, ...)
    3. User (id, username, passwd, salt, ...)

    But it would take too much time. We did not have so much time so....

    The solution was... ElasticSearch.
    We indexed the products with all the neccessary attributes. It took 16 hours to write some piece of code that handles it.

    Stage 1 - ElasticSearch structure for Product

    Let's create the ProductIndex which will contain all the neccessary fields for a single Product item.

    class ProductIndex {
    protected $title;
    protected $title_en;
    protected $description;
    protected $description_en;
    protected $photo;
    protected $category;
    protected $categoryTree;
    ....
    }


    Every single Product will be generated from TaxonomyItem + (JOIN) TaxonomyFields MySQL Table.
    We also included $categoryTree variable, that stores generated List categories, the products belongs to. It will be usefull for breadcrumbs for example.

    Symfony Command Line for indexing all products from MySQL database.

    namespace AppBundle\Command;

    use AppBundle\Entity\User;
    use AppBundle\Services\ElasticSearchService;
    use AppBundle\Services\TaxonomyManager;
    use Symfony\Bundle\FrameworkBundle\Command\ContainerAwareCommand;
    use Symfony\Component\Console\Input\InputArgument;
    use Symfony\Component\Console\Input\InputInterface;
    use Symfony\Component\Console\Input\InputOption;
    use Symfony\Component\Console\Output\Output;
    use Symfony\Component\Console\Output\OutputInterface;

    use League\Csv\Reader;
    use League\Csv\Statement;

    class BuildProductIndexCommand extends ContainerAwareCommand
    {
    protected function configure()
    {
    $this
    ->setName('index:products')
    ;
    }

    protected function execute(InputInterface $input, OutputInterface $output)
    {
    define('tradeSubdomain', 'hagebau');

    $container = $this->getContainer();
    $output->writeln('Building index...');

    /** @var TaxonomyManager $taxonomyManager */
    $taxonomyManager = $container->get('manager.taxonomy');

    /** @var ElasticSearchService $elasticSearchService */
    $elasticSearchService = $container->get('service.elastic_search');
    $products = $taxonomyManager->findAllTaxonomyItems('products');

    foreach ($products as $product) {
    if ((String) $product->lookup('root_product') != '') {
    continue;
    }

    if ((String) $product->lookup('category') == '') {
    continue;
    }

    $elasticSearchService->buildProductIndex($product);
    $output->writeln(sprintf('Product with id: %s has been indexed.', $product->getId()));
    }
    }
    }




    As we can see, the buildProductIndex($product) method is doing all the job - it stores the data to ElasticSearch database.
    The implementation is shown below:

    public function buildProductIndex(TaxonomyItem $product)
    {
    $this->elasticSearchClient->setType('product');

    $fieldsToIndex = [
    'slug', 'title', 'title_en', 'long_description', 'long_description_en', 'important_information', 'important_information_en',
    'industry', 'agb', 'agb_en', 'photo', 'uncountable', 'responsible_person', 'custom_template', 'category', 'thumbnail',
    ];

    $index = [
    'id' => $product->getId(),
    'position' => $product->lookup('position') ? (Integer) $product->lookup('position')->getValue() : null,
    'invoice_include' => (Boolean) $product->lookup('invoice_include'),
    'price' => unserialize((String) $product->lookup('price')),
    'price_en' => unserialize((String) $product->lookup('price_en')),
    ];

    foreach ($fieldsToIndex as $field) {
    $index[$field] = (String) $product->lookup($field);
    }

    $categoryTree = [];
    $parent = $product->lookup('category');

    do {
    $c = $this->taxonomyManager->findOneTaxonomyItem('categories', ['slug' => $parent]);
    if (!$c) {
    break;
    }

    $category = $this->taxonomyManager->build($c);

    array_unshift($categoryTree, $category);

    $parent = $category->lookup('subcategory')->getValue();
    } while ($parent != '');

    $tree = [];
    foreach ($categoryTree as $t) {
    $tree[] = [
    'slug' => (String) $t->lookup('slug'),
    'title' => (String) $t->lookup('title'),
    'title_en' => (String) $t->lookup('title_en'),
    ];
    }

    $index['category_path'] = $tree;

    if (count($tree) == 0) {
    return;
    }

    return $this->elasticSearchClient->index($index, $product->getId());
    }


    Server resources usage has been reduced by 40%. Same like MySQL DB queries - reduced by 80-90%.

    Homepage (Before 80 queries per req) => after ES implementation - 3 queries,
    Product Search Page (Before 140 queries per req) => after ES implementation - 14 queries,
    Category Page (Before 75 queries per req) => after ES implementation - 7 queries,

    If you are interested more about ElasticSearch or performance solutions, feel free to contact us via e-Mail: biuro@wiseweb.pl.

    Kind Regards,
    Mateusz