The potential of social media (SM) as a dissemination channel for traffic information is becoming increasingly apparent. Many authorities around the world have created dedicated SM accounts and use them as a two-way communication channel with the public. However, travellers, and particularly tourists, seeking transport-related information do not necessarily turn to official SM accounts as their first choice, preferring instead content-sharing services such as the Question & Answer (Q&A) forums offered by, for example, TripAdvisor. The main interest of the transport authority is to ensure that the information conveyed to travellers is of high quality and, above all, correct. Given the large number of questions posted in Q&A forums, manually scanning all questions, identifying those that are transport-related, and checking the quality of the replies would be time-consuming and impractical. In this paper we present a methodology for automatically categorizing transport-related questions posted in Q&A forums such as those of TripAdvisor, and for extracting questions that seek travel instructions. We describe how we developed the necessary classifiers, and we demonstrate their applicability to various cities. We also demonstrate the feasibility of automatically extracting the origin and destination referred to in questions posted on TripAdvisor, thus enabling authorities to use the proposed methodology to learn more about commonly taken routes.
- Information quality
- Social media
- Text mining
- Transport-related information