Improving conversations with digital assistants through extracting, recommending, and verifying user inputs

Sarah A Burke, Shauna Logan, Larissa C Maksi

Abstract


Digital assistants, including chat bots and voice assistants, suffer from discrepancies and uncertainty in human text and speech inputs. Human dialogue is often varied, ambiguous, and inconsistent, making data entry prone to error and difficult for digital assistants to process. Finding and extracting pertinent information from unstructured user inputs improves and expands the use of digital assistants on any platform. By confirming data entries and providing relevant recommendations when invalid information is provided, the digital assistant enables the use of natural language and introduces a higher degree of flow into the conversation.

This paper describes a series of input logic codifiers that form a corrective method to overcome errors and ambiguity typical of voice and text inputs. When users make a common mistake or forget data, the digital assistant can bridge the gap by recommending the most similar data that is available. The assistant measures the delta between the user’s utterance and valid entries using fuzzy logic to identify the closest and next closest data that relates to the unstructured text.
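The recommendation step described above can be sketched with a string-similarity ranking. This is a minimal illustration under our own assumptions, not the authors' implementation: the function name, the 0.6 cutoff, and the use of Python's `difflib` similarity ratio in place of the paper's fuzzy-logic measure are all choices made here for brevity.

```python
import difflib

def recommend(utterance, valid_entries, n=2, cutoff=0.6):
    """Rank valid entries by similarity to the user's utterance and
    return the n closest matches above the cutoff, closest first."""
    scored = [
        (difflib.SequenceMatcher(None, utterance.lower(), entry.lower()).ratio(), entry)
        for entry in valid_entries
    ]
    # Highest similarity first; entries below the cutoff are not offered.
    scored.sort(reverse=True)
    return [entry for score, entry in scored if score >= cutoff][:n]
```

For a misspelled utterance such as "Bostn" against valid entries ["Boston", "Austin", "Houston"], the assistant can offer the closest match ("Boston") and a next-closest alternative rather than rejecting the input outright.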

Furthermore, dates, locations, and similar entities can be expressed in countless ways, making it difficult for digital assistants to extract accurate and relevant data from the user’s natural language. However, the assistant may infer the desired data format or reference from the dialogue provided and validate it with the user through a follow-on question. The desired data format or type is inferred using fuzzy extraction methods, such as fuzzy date extraction, to isolate the desired data from the unstructured text. This extracted information is then verified or confirmed by the user to maintain data accuracy and avoid downstream data quality issues.
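Fuzzy date extraction can be illustrated with a deliberately small sketch: scan the utterance for token windows that parse under a handful of common date formats. This is an assumption on our part; the function name and format list are hypothetical, and a production assistant would more likely rely on a dedicated library such as dateutil's fuzzy parser (cited in the references).

```python
import re
from datetime import datetime

# Hypothetical format list -- a real extractor would cover far more notations.
DATE_FORMATS = ["%B %d %Y", "%m/%d/%Y", "%d %B %Y", "%Y-%m-%d"]

def extract_date(utterance):
    """Return the first date found in the utterance, or None."""
    # Drop punctuation such as commas, keeping word chars, '/', ' ', and '-'.
    cleaned = re.sub(r"[^\w/ -]", "", utterance)
    tokens = cleaned.split()
    # Try every contiguous window of 3, 2, then 1 tokens against each format.
    for size in (3, 2, 1):
        for i in range(len(tokens) - size + 1):
            candidate = " ".join(tokens[i:i + size])
            for fmt in DATE_FORMATS:
                try:
                    return datetime.strptime(candidate, fmt).date()
                except ValueError:
                    continue
    return None
```

Given "book a flight for March 5, 2024 please", the sketch isolates March 5, 2024; the assistant would then echo the parsed date back to the user for confirmation before committing it.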

Keywords


digital assistant; fuzzy logic; voice assistant; chat bot; natural language processing


References


Hicks, K. H. (2022, February 2). Department of Defense software modernization. US Department of Defense. Retrieved June 27, 2022, from https://media.defense.gov/2022/Feb/03/2002932833/-1/-1/1/DEPARTMENT-OF-DEFENSE-SOFTWARE-MODERNIZATION-STRATEGY.PDF

Kupzyk, K. A., & Cohen, M. Z. (2015). Data validation and other strategies for data entry. Western Journal of Nursing Research, 37(4), 546-556.

Lazar, J., Jones, A., Hackley, M., & Shneiderman, B. (2006). Severity and impact of computer user frustration: A comparison of student and workplace users. Interacting with Computers, 18(2), 187-207.

Meyer, D. E., & Kieras, D. E. (1997). A computational theory of executive cognitive processes and multiple-task performance: Part I. Basic mechanisms. Psychological Review, 104(1), 3.

Ruan, S., Wobbrock, J. O., Liou, K., Ng, A., & Landay, J. A. (2018). Comparing speech and keyboard text entry for short messages in two languages on touchscreen phones. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 1(4), 1-23.

Statista. (2022, March 18). U.S. voice assistant users 2017-2022. Retrieved June 27, 2022, from https://www.statista.com/statistics/1029573/us-voice-assistant-users/

Maedche, A., Legner, C., Benlian, A., Berger, B., Gimpel, H., Hess, T., ... & Söllner, M. (2019). AI-based digital assistants: Opportunities, threats, and research perspectives. Business & Information Systems Engineering, 61, 535-544.

Swerts, M., Litman, D. J., & Hirschberg, J. (2000, October). Corrections in spoken dialogue systems. In INTERSPEECH (pp. 615-618).

Pal, D., Arpnikanondt, C., Funilkul, S., & Varadarajan, V. (2019). User experience with smart voice assistants: The accent perspective. 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), 1-6. https://doi.org/10.1109/ICCCNT45670.2019.8944754

Sokol, N., Chen, E., & Donmez, B. (2017). Voice-controlled in-vehicle systems: Effects of voice-recognition accuracy in the presence of background noise. Driving Assessment Conference, 9(2017). https://doi.org/10.17077/drivingassessment.1629

Munster, G., & Thompson, W. (2019). Annual digital assistant IQ test. Retrieved from https://loupventures.com/annual-digital-assistant-iq-test

Al-Karawi, K. A., Al-Noori, A. H., Li, F. F., & Ritchings, T. (2015). Automatic speaker recognition system in adverse conditions—implication of noise and reverberation on system performance. International Journal of Information and Electronics Engineering, 5(6), 423-427.

Hasal, M., Nowaková, J., Ahmed Saghair, K., Abdulla, H., Snášel, V., & Ogiela, L. (2021). Chatbots: Security, privacy, data protection, and social aspects. Concurrency and Computation: Practice and Experience, 33(19), e6426.

Wellsandt, S., Hribernik, K., & Thoben, K. D. (2021). Anatomy of a digital assistant. In Advances in Production Management Systems. Artificial Intelligence for Sustainable and Resilient Production Systems: IFIP WG 5.7 International Conference, APMS 2021, Nantes, France, September 5–9, 2021, Proceedings, Part IV (pp. 321-330). Springer International Publishing.

Chuang, H. M., & Cheng, D. W. (2022). Conversational AI over Military Scenarios Using Intent Detection and Response Generation. Applied Sciences, 12(5), 2494.

He, T., Xu, X., Wu, Y., Wang, H., & Chen, J. (2021). Multitask learning with knowledge base for joint intent detection and slot filling. Applied Sciences, 11(11), 4887.

Razzouk, R., & Shute, V. (2012). What is design thinking and why is it important? Review of Educational Research, 82(3), 330-348.

Klatt, D. H. (1987). Review of text-to-speech conversion for English. The Journal of the Acoustical Society of America, 82(3), 737-793.

Trivedi, A., Pant, N., Shah, P., Sonik, S., & Agrawal, S. (2018). Speech to text and text to speech recognition systems: A review. IOSR Journal of Computer Engineering, 20(2), 36-43.

Borau, S., Otterbring, T., Laporte, S., & Fosso Wamba, S. (2021). The most human bot: Female gendering increases humanness perceptions of bots and acceptance of AI. Psychology & Marketing, 38(7), 1052-1068.

Pereda, R., & Taghva, K. (2011, April). Fuzzy information extraction on OCR text. In 2011 Eighth International Conference on Information Technology: New Generations (pp. 543-546). IEEE.

Wang, Y., Qin, J., & Wang, W. (2017, October). Efficient approximate entity matching using jaro-winkler distance. In Web Information Systems Engineering–WISE 2017: 18th International Conference, Puschino, Russia, October 7-11, 2017, Proceedings, Part I (pp. 231-239). Cham: Springer International Publishing.

dateutil. (2019). parser. dateutil documentation, Read the Docs. Available: https://dateutil.readthedocs.io/en/stable/parser.html

Mycroft AI Inc. (2017). mycroft.util.parse. Mycroft documentation, Read the Docs. Available: https://mycroft-core.readthedocs.io/en/latest/source/mycroft.util.parse.html

Wei, Z., & Landay, J. A. (2018). Evaluating speech-based smart devices using new usability heuristics. IEEE Pervasive Computing, 17(2), 84-96.




DOI: https://doi.org/10.23954/osj.v8i2.3402



This work is licensed under a Creative Commons Attribution 4.0 International License.

Open Science Journal (OSJ) is a multidisciplinary Open Access journal. We accept scientifically rigorous research regardless of novelty. OSJ's broad scope provides a platform for publishing original research in all areas of science, including interdisciplinary and replication studies as well as negative results.