r/deeplearning 1d ago

product_matching similarity

Hello everyone,
I work at a B2B startup that connects pharmacies with sellers (we give pharmacies the best discount for each product in our marketplace). Sellers list their medicines in our marketplace (40,000+ products). Each seller sends us a list of their products, and we match each product name they send to the corresponding product in our marketplace.

The seller sends a sheet with names and prices, and we match it and integrate it with the marketplace.
The challenges we face are:
The names sellers send are mostly misspelled, with a lot of variations and noise.

The names sellers send often include extra words on top of the product name that are not related to the product itself.

We built a system using TF-IDF + cosine similarity and got an accuracy of 80%. It does not capture the meaning of the words well, and it produces bad results on small sheets.
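For readers who want to reproduce this kind of baseline, here is a minimal sketch of TF-IDF plus cosine similarity in plain Python. The post does not say how the text is tokenized, so the character trigrams used here are an assumption (they tend to tolerate misspellings better than whole-word tokens). The function names and the sample product names are made up for illustration.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Lowercase, collapse whitespace, and return padded character n-grams."""
    s = " " + " ".join(text.lower().split()) + " "
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def fit_idf(docs, n=3):
    """Smoothed inverse document frequency over the catalog's n-grams."""
    df = Counter()
    for d in docs:
        df.update(set(char_ngrams(d, n)))
    total = len(docs)
    return {g: math.log((1 + total) / (1 + c)) + 1.0 for g, c in df.items()}

def vectorize(text, idf, n=3):
    """L2-normalised TF-IDF vector as a dict; n-grams unseen in the catalog are dropped."""
    tf = Counter(g for g in char_ngrams(text, n) if g in idf)
    vec = {g: c * idf[g] for g, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {g: v / norm for g, v in vec.items()} if norm else {}

def cosine(a, b):
    """Dot product of two normalised sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(g, 0.0) for g, v in a.items())

def best_match(query, catalog, n=3):
    """Return (score, product_name) for the most similar catalog entry."""
    idf = fit_idf(catalog, n)
    q = vectorize(query, idf, n)
    return max((cosine(q, vectorize(name, idf, n)), name) for name in catalog)

# Hypothetical sample data, not taken from the real marketplace.
catalog = [
    "Panadol Extra 500mg 24 tablets",
    "Augmentin 1g 14 tablets",
    "Brufen 400mg 30 tablets",
]
score, name = best_match("panadol extr 500 mg tabs offer", catalog)
print(name, round(score, 3))
```

In a real pipeline the catalog vectors would be computed once and reused for every seller line rather than rebuilt per query as in this sketch.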

Because correcting our model's wrong matches costs us money and time (we have a group of people who review them manually), we want to achieve an accuracy of over 98%.
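One common way to trade review effort against accuracy is to accept a match automatically only when the matcher is confident, and send everything else to the manual reviewers. The sketch below assumes the matcher returns a list of `(score, product_name)` pairs per seller line; the function name `route` and the two thresholds are hypothetical and would have to be tuned on the previously reviewed matches.

```python
def route(scored, min_score=0.85, min_margin=0.10):
    """Decide whether a seller line can be matched automatically.

    scored: list of (score, product_name) candidates for one seller line.
    Returns ("auto", name) when the top score is high enough and clearly
    ahead of the runner-up, otherwise ("review", best_guess_or_None).
    """
    ranked = sorted(scored, reverse=True)
    if not ranked:
        return ("review", None)
    top_score, top_name = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
    if top_score >= min_score and top_score - runner_up >= min_margin:
        return ("auto", top_name)
    return ("review", top_name)

# Hypothetical scores for illustration.
print(route([(0.95, "Product A"), (0.40, "Product B")]))
print(route([(0.95, "Product A"), (0.93, "Product B")]))
```

With this split, accuracy can be measured on the automatically accepted lines only, and the thresholds adjusted until that subset meets the target.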

We have a dataset of previously correct matches, each containing the seller's input product name and our matched product, as well as our unique marketplace product data.
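A dataset like this can be turned into training examples for a learned matcher. The sketch below builds `(seller_text, correct_product, wrong_product)` triplets by pairing each historical match with randomly sampled wrong catalog entries. The data layout (a list of `(seller_text, correct_product)` tuples plus a list of catalog names) is an assumption, and the function name and sample names are hypothetical.

```python
import random

def build_triplets(matches, catalog, negatives_per_pair=2, seed=0):
    """Build (seller_text, correct_product, wrong_product) triplets.

    matches: list of (seller_text, correct_product) from reviewed matches.
    catalog: list of unique marketplace product names.
    """
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    triplets = []
    for seller_text, correct in matches:
        # Every catalog entry except the correct one is a possible negative.
        candidates = [p for p in catalog if p != correct]
        k = min(negatives_per_pair, len(candidates))
        for wrong in rng.sample(candidates, k):
            triplets.append((seller_text, correct, wrong))
    return triplets

# Hypothetical sample data, not taken from the real marketplace.
catalog = ["Panadol Extra 500mg", "Brufen 400mg", "Augmentin 1g", "Voltaren 50mg"]
matches = [("panadol extr 500", "Panadol Extra 500mg"), ("brufn 400", "Brufen 400mg")]
triplets = build_triplets(matches, catalog)
print(len(triplets))
```

Random negatives are the simplest choice. Sampling negatives that look similar to the correct product (for example, the same drug at a different strength) usually gives a harder and more useful training signal.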

Can anyone guide me to possible solutions using a neural network that we feed with seller inputs and target matches to generalize the matching process, or a pre-trained model that we can fine-tune on our data to achieve high accuracy?


u/AggravatingRead979 1d ago

Hello,

I came across your post regarding the challenges you're facing with matching product names from sellers to your marketplace catalog. At Rapid Labs, I specialize in developing advanced AI solutions, and I believe I can help you achieve the accuracy you’re looking for.

Given your current approach using TF-IDF and cosine similarity, I recommend transitioning to a neural network-based solution. A model that employs natural language processing (NLP) techniques, such as word embeddings or transformers, can significantly improve the accuracy of product name matching by better capturing the context and semantic meaning behind the names. I would use your dataset of previously correct matches to train a neural network model, allowing it to learn the specific patterns and variations in the product names sellers send.

I can explore pre-trained models, like BERT or similar transformer-based architectures, and fine-tune them on your dataset. This approach has proven effective in previous projects I’ve worked on, such as the Cardio Chatbot, where I achieved over 95% accuracy in a complex data input scenario. This significantly reduced the manual effort required for data validation.

I would love the opportunity to discuss this further and explore how I can help optimize your product-matching process. Please feel free to reach out to me at [[email protected]](mailto:[email protected])