Product Attribute Extraction Based Real-Time C2C Matching of Microblogging Messages

Mohamed Rilf • Dilum Bandara • Surangika Ranathunga

10:30 - 11:00 | Wednesday 30 May 2018 | 1st Floor Lobby



We describe a solution for real-time matching of microblogging messages that express interest in buying or selling products. Matching consumer-to-consumer (C2C) buy/sell interests in real time is nontrivial because of the difficulty of interpreting social media messages, the volume of messages, and the diversity of products and services. We therefore combine techniques from natural language processing, complex event processing, and distributed systems. First, we convert each message into a structured semantic representation using named-entity recognition with conditional random fields (CRF) and logistic regression. The extracted data are then matched using a complex event processor. NoSQL storage and in-memory computing are used to improve scalability and performance. On a real-world dataset, the classification and CRF models achieved accuracies of 98.5% and 82.07%, respectively. Latency was low: information extraction, in-memory data manipulation, and complex event processing took 0.5 ms, 5 ms, and 3.6 ms, respectively.
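
The matching stage can be illustrated with a minimal Python sketch. It assumes messages have already been classified as buy or sell and that a product name and a price have been extracted. The `Interest` schema, the `InMemoryMatcher` class, and the price rule are hypothetical; they stand in for the complex event processor and in-memory store, whose actual design the abstract does not specify.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Interest:
    """A message after classification and attribute extraction (hypothetical schema)."""
    user: str
    intent: str    # "buy" or "sell", as a classifier might label the message
    product: str   # product entity, as a CRF tagger might extract it
    price: float   # asking price for "sell", budget for "buy"


class InMemoryMatcher:
    """Toy stand-in for the matching stage; pending interests live in memory."""

    def __init__(self):
        # normalized product name -> intent -> list of pending interests
        self._pending = defaultdict(lambda: {"buy": [], "sell": []})

    def submit(self, interest):
        """Return pending opposite-side interests that match, then store this one."""
        if interest.intent not in ("buy", "sell"):
            raise ValueError("intent must be 'buy' or 'sell'")
        key = interest.product.lower()
        opposite = "sell" if interest.intent == "buy" else "buy"
        matches = [
            other for other in self._pending[key][opposite]
            if self._price_ok(interest, other)
        ]
        self._pending[key][interest.intent].append(interest)
        return matches

    @staticmethod
    def _price_ok(a, b):
        # Assumed rule: a pair matches when the asking price fits the budget.
        buyer, seller = (a, b) if a.intent == "buy" else (b, a)
        return seller.price <= buyer.price


matcher = InMemoryMatcher()
matcher.submit(Interest("alice", "sell", "iPhone 6", 300.0))
matches = matcher.submit(Interest("bob", "buy", "iphone 6", 350.0))
```

Each incoming interest is compared only against pending interests for the same product, so the work per message depends on the number of pending entries for that product and not on the total message volume. A production system would replace the exact product-name key with the richer attribute comparison that the extraction step makes possible.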