MICES 2024

Mix-camp E-commerce Search

When

12th June 2024

Opening at 8:30 am
Starting at 9:00 am

Where

Berlin, Germany

KOPF, HAND + FUSS gGmbH
c/o TUECHTIG
Oudenarder Straße 16
House D06, 1st floor
13347 Berlin

This was MICES EU 2024!

You will find the slides and videos from the talks in the Programme section below.

Format

MICES brings together participants from a variety of backgrounds, all sharing a common interest in e-commerce search. In order to stimulate and facilitate discussion, we will start with scheduled talks in the morning. This will be followed by self-organising sessions: participants are encouraged to initiate discussions about their topics of interest or to give an ad hoc presentation.

Topics

The workshop welcomes all topics related to e-commerce search. We have compiled a list of topics that will probably feature at the workshop. This list is not comprehensive. Participants are welcome to discuss further topics at the workshop.

  • Managing e-commerce search: what are the right process and organisation for it, and how do they change over time?
  • How can we measure and improve search quality for e-commerce?
  • What are the best search relevance models for e-commerce search?
  • Personalised e-commerce search
  • Choosing the right search technology
  • Tools and processes for 'Searchandizing'
  • From keyword search to virtual shopping assistants: designing the e-commerce search user experience
  • Exploring artificial intelligence, visual and voice search for e-commerce
  • How to deal with poor data quality in e-commerce search

Programme

MICES 2024

08:30

Registration

09:00 - 10:00

Welcome & Introduction

René Kriegler / Sebastian Russ
Organisers

Slides

Planning of e-commerce search relevance work

Doug Turnbull
Reddit

Is your quarterly search planning a headache? Maybe you're a PM that wants to try the next big AI thingy. Or a techie and you want to explore a cool conference talk. But herein lies a paradox - how can you try these new ideas, without prematurely making a deep investment in time and people? In this talk, I will propose a methodology for rapid prototyping of dozens of search relevance ideas at once. By using the magic of SmallData™, quick, timeboxed experiments can test whether an idea shows promise before we go deeper into building. I’ll talk through how I’ve evolved my thinking about realistic relevance planning within an organization. How do we empower partners and our own curiosity without constantly saying no, while at the same time minimizing time spent on work that doesn’t lead to success? I’ll walk through how I think about this with a real e-commerce dataset, thinking through how to maximize the number of ideas we try, not just getting obsessed during the project.

Doug Turnbull has been enthusiastic about search relevance since 2013. He co-authored Relevant Search and AI Powered Search. He created Quepid and Splainer for search relevance testing. He co-created the Elasticsearch Learning to Rank plugin with Wikimedia Foundation and Snagajob. Doug loves learning from other search practitioners, and hopes you'll bring inquisitive curiosity and experiences to this talk.
Doug currently works at Reddit where he's helping bring Machine Learning to search. Recently Doug worked at Shopify to help improve merchant search attributed revenue by 19% year over year. Doug spent 8 years consulting at dozens of organizations during his time as CTO at OpenSource Connections. Doug blogs about search and other topics at http://softwaredoug.com.

Slides
Video (Youtube)

10:00 - 10:15

Break

10:15 - 11:20

Vectorizing consumer electronic goods

Ruchi Juneja (MediaMarktSaturn) & Johannes Peter (Principal Search Consultant)

Comprehensive quality analyses of our keyword-based search revealed that it works very well for the short-head, but has room for improvement in the long-tail area. As a consequence, we decided to enhance our keyword-based search with a vector search approach, since it appears promising for addressing the identified long-tail issues. The goal of our first experimental iteration was to evaluate whether publicly available models can help us when they are fine-tuned with our own data. The outcome of this experiment was that all fine-tuned public models showed extremely poor search performance, irrespective of the size of the training data or the number of epochs. The reasons were quickly identified: 1) the data and especially the vocabulary of these models are only minimally representative of descriptions of consumer electronic goods, 2) the product descriptions used for the training contained a lot of irrelevant information and 3) the way vector search works out-of-the-box is only partially suitable for e-commerce purposes and requires customization. Consequently, the goals of our second experimental iteration were 1) to train our own model from scratch, 2) to build our own use-case-specific product descriptions and 3) to customize the model layers in order to eliminate undesirable factors like length of description or term frequency. The results of this experimental iteration will be discussed.

Ruchi works as a Sr. Data Scientist at MediaMarktSaturn in the Search ML team. For over a decade, she has immersed herself in the world of data science working on various business problems. The last five years have truly defined her path to the fascinating realm of search. Her passion for search technologies has led her on a journey across various industries, from education to music and e-commerce, delving deep into the intricacies of search algorithms, specialising in search ranking and query understanding.

Johannes is a passionate leader, consultant and engineer in the areas of search, data and cloud. He is experienced in driving and implementing solutions from initial ideas towards systems running in productive environments and measurably contributing to business KPIs. Throughout his entire career, Johannes has been passionate about open source. In his early days, he contributed several features to Apache NiFi to improve its integration with Apache Solr. Later he became a main contributor and committer for Querqy, a query rewriting solution for Apache Solr and Elasticsearch used by various big retail companies worldwide.

Slides
Video (Youtube)

How semantic search projects fail

Roman Grebennikov
Delivery Hero

Build embeddings with your preferred vendor and put them into the vector search database. What can go wrong? However, in practice, this is the moment where real problems arise:

- Different embeddings, but the search results are very similar and still bad. Do embeddings matter? Should we opt for OpenAI or open-source?
- Customers signal that “ketchup” is irrelevant to the “tomato” query, but the ML model thinks otherwise. How can we control a black-box embedding? Is there value in fine-tuning?
- Search results are relevant for English queries but not as relevant for Spanish-Arabic-Chinese. How should we handle multilingual searches? What about typos?
- A semantic search always returns results, even for the most wrong and bizarre queries such as ‘how to bake a car’ or ‘why is the sky made of cheese’. When should you stop the search? How can you balance precision and recall?

At Delivery Hero, we made three attempts to implement a semantic search for all 40 countries where we deliver food — only the last one **partially** surpassed the strong lexical search baseline we spent years hand-tuning. In this talk, we’ll discuss challenges and failures we encountered throughout this journey.

Roman is a Principal ML engineer at Delivery Hero and an ex-startup CTO working on modern search and recommendation problems. He is a pragmatic fan of open-source software, functional programming, LLMs and performance engineering.

Slides
Video (Youtube)

11:20 - 11:40

Break

11:40 - 12:45

Offline evaluation of product search with model-based judgments at Shopify

Alberto Castelo Becerra
Shopify

At Shopify the efficacy of search systems significantly influences user experience and business performance. Offline evaluation of search is crucial for enabling fast and efficient iteration cycles necessary for continuous improvement. However, the reliance on raw behavioral data such as clicks and orders presents challenges due to its inherent sparsity and multiple biases. In this talk, I will discuss how cutting-edge machine learning models, specifically cross-encoder models, can be utilized to overcome these challenges. We will explore how these models can robustly assess and enhance the output of product search pipelines, ensuring more accurate evaluations and scalable testing before deployment.

Alberto is a Senior Machine Learning engineer on the Search Foundations team at Shopify, where he has worked for the last 3 years. He primarily focuses on leveraging machine learning for search and recommendation problems.

Slides
Video (Youtube)

Image Search for Product Recommendations: the Good, the Bad, and the Ugly Use Cases

Paul-Louis Nech & Raed Chammam
Algolia

What data do you use to fuel your product carousels, be it on category pages or on add-to-cart UIs? Do you use only textual attributes? Business-relevant metadata such as margins and stock levels? Or even user behavior metrics like clicks and conversions? These are all good sources of data to recommend meaningful products to your customers. However, most e-commerce experiences have another very valuable kind of data: solid product images.

Such imagery allows e-commerce businesses to leverage image search technologies to offer relevant suggestions based on their catalog: recommend similar items when one is unavailable, identify complementary products from their looks, and many more creative use cases. But what makes a good image recommendation UX in the e-commerce world? Is it just the quality of the product itself that you see? The relevance of this item in the context of what you’re looking at? Maybe it’s simply that some use cases shine while others are, hum, interesting?

In this talk, Raed and Paul-Louis will walk you through several real-life examples using LookingSimilar to build visual product carousels for different e-commerce businesses targeting various kinds of users. You’ll see use cases that are really good, some initiatives that are rather bad… and a few experiments which can only be described as ugly. The audience will get tips to find what kind of use cases would benefit most from image search based product recommendations, and how to adapt their data to make the most of it.

Paul-Louis Nech is a Machine Learning engineer with 10 years of experience crafting software for a global audience. Throughout his career, he has leveraged advanced algorithms to empower developers and users alike. From SwiftKey to Algolia, Paul-Louis has built tools that bridge the gap between state-of-the-art algorithms (which work in the lab) and end products (which succeed in the wild). This requires not only great technology, but more importantly great user-experience, to make sure you build something people want.

Raed Chammam is a software engineer with a profound passion for frontend development and a diverse background in various programming stacks. With experience ranging from e-commerce to search engines, he focuses on delivering delightful and consistent user experiences. At the conference, Raed will share valuable insights on crafting easy-to-use interfaces for complex technologies.

Slides
Video (Youtube)

12:45 - 13:45

Lunch

13:45 - 14:50

User behavior insights

Stavros Macrakis (OpenSearch at AWS) & Charlie Hull (OpenSource Connections)

Every e-commerce search professional needs data about users' behavior. Data is fundamental for analyzing user behavior and improving search relevance, both with manual tuning and with machine learning. The User Behavior Insights system provides a standard way to do that. Until now, collecting fine-grained user behavior data has been haphazard. Many search developers have developed their own ad hoc data collection systems and analysis tools. Proprietary systems collect data but mostly track page-to-page flow, and not granular data on the flow of search results through the system, mostly sharing aggregations with their customers. And open-source systems generally don't offer data collection mechanisms. Our open-source User Behavior Insights (UBI) system provides a client-side library for instrumenting web pages, server-side tools for collecting data, and analytical tools for understanding it. Critically, it defines a standard schema for behavior data so that the community can contribute additional analytical tools. We will share a call to action to the e-commerce search community on the need to make it simpler to seamlessly track the steps of a user’s search journey in an ethical and safe manner in order to build the experiences of the future.

Stavros is passionate about search relevance. At AWS, Google, GLG, FAST, and Lycos, he has worked with a wide variety of organizations and applications, and is frequently astounded at how even large, sophisticated organizations fail to collect the data needed to evaluate and tune their search systems.

Charlie Hull is the Marketing Director at OSC and also leads client projects. He keeps a strategic view on developments in the search industry and is in demand as a speaker at conferences across the world.

Slides
Video (Youtube)

Search & Privacy as One

Ángel Maldonado (Motive.co, Empathy.co, VisiblePrivacy.com) & Alex Barrett (Spaceheater.ai) & Ben Cooper (Spaceheater.ai)

AI-enhanced Search & Privacy can co-exist when the wealth of possible customer data lives in a neutral space, one that is intrinsically trusted. In this talk, Alex and Ángel will share three stories that demonstrate that there are alternatives to centralised AI Search when user rights are considered a priori.

Through these three different retrieval experiences (Customers, Merchants and CPGs), the talk exemplifies a realisation of retrieval that encapsulates privacy, control and agency within the user's computational capsule.

Ángel is a computer scientist with a passion for philosophy and design. He founded the co-authoring platform Open Innovation (oi.empathy.co) to finance and bring to market new ventures such as PrivacyCloud.com, Empathy.co, Motive.co, VisiblePrivacy.com or the EthicalAlliance.co. He lives in London and enjoys music and meditation :)

With almost a decade of experience in early-stage consumer-facing Me2B / SaaS / CX related startups, Alex Barrett has put forth a great deal of thought and energy, passionately focused on customer-centric experience, relationships and behaviors. Alex’s latest venture, Spaceheater.ai, aims to unlock previously unimaginable digital experiences through the prioritization and empowerment of user privacy and control.

Slides
Video (Youtube)

14:50 - 15:05

Break

15:05 - 15:35

An AI Assistant in the life of a Search Engine Administrator

Lucian Precup & Maëlly Dubois
all.site / Adelean

The integration of AI assistants into administrative tools like Kibana and OpenSearch Dashboards has opened new possibilities beyond traditional observability and monitoring functions. In this talk, we propose an innovative AI assistant tailored specifically for e-commerce administration consoles, focusing on optimizing search engine relevance and performance. Unlike existing assistants, our solution not only responds to natural language commands but also proactively suggests actionable insights to improve user experience. For instance, it can recommend creating filtered redirections based on user behavior patterns or propose synonym enhancements to minimize zero-result queries. Moreover, administrators can leverage the assistant to execute complex tasks such as adjusting product visibility or fine-tuning search parameters through intuitive dialogue. Our presentation will delve into the architectural requirements for implementing such an assistant effectively, as well as outline a comprehensive set of capabilities and recommendations for maximizing its utility in real-world scenarios. Join us to explore the future of AI-driven e-commerce administration and discover how our innovative approach can revolutionize search engine management.

With his colleagues at Adelean, Lucian develops solutions for indexing, searching and analyzing data. Lucian regularly shares his knowledge in specialized conferences and organizes the Search, Data and AI Meetup in Paris.

As a Product Owner at Adelean, Maëlly develops a2 - an e-commerce search engine built on top of Apache Lucene and all.site - the collaborative insight engine and copilot created at Station F in Paris.

Slides
Video (Youtube)

15:40 - ca. 18:00

Barcamp / self-organising sessions

Organisers & Partners

René Kriegler

René works as Chief Strategy Officer at OpenSource Connections. As a search consultant he has supported clients in Germany and abroad for 17 years. Although he is interested in all aspects of search, key areas include search relevance consulting, e-commerce search and Apache Solr/Lucene. René maintains the Querqy open source library. He co-founded MICES in 2017 together with Paul Bartusch and Isabell Drost-From.

Sebastian Russ

Sebastian works as Product Manager at Tudock, focussing on e-commerce search. His passion for search is reflected in a wide range of search projects with both closed and open source solutions. Driven by curiosity and the challenging nature of search topics, he believes that collaboration between technology, business and design brings the best results. MICES is a great place to make that happen, and Sebastian is happy to be able to contribute to the search community he has learned so much from.

Berlin Buzzwords

MICES is partnering with Berlin Buzzwords and takes place on the day after the main conference. See here for information and tickets.

Sponsors

MICES is sponsored by

Tudock OpenSource Connections Ethical Commerce Alliance
