« Crossroads of Speech and Language »


AI Data Innovations G10

AI Data Innovations is a highly experienced and trusted machine learning data provider with resources around the world. We developed our reputation for excellence by serving large global companies with significant international requirements. We support machine learning through data collection, annotation, classification, and speech/text/robotics support. With a wide variety of secure solutions, we can provide sensitive, high-quality data in the time-frames that you need. And our innovative approaches ensure that our partners get the industry’s best approaches to quality and efficiency.

Alibaba Group F3

We enable businesses to transform the way they market, sell and operate and improve their efficiencies. We provide the technology infrastructure and marketing reach to help merchants, brands and other businesses to leverage the power of new technology to engage with their users and customers and operate in a more efficient way.

Amazon G1

Amazon Alexa is proud to be the Founding Sponsor of Interspeech 2019. Our mission: make Alexa the most trusted, competent and natural personal assistant and companion for our customers worldwide. Alexa first launched on the Amazon Echo nearly five years ago, in November 2014. Since that time, Alexa scientists and engineers have focused relentlessly on making Alexa as natural to engage with as speaking to a human. Science is critical to how Alexa is revolutionizing daily conveniences, from playing music and controlling your smart home, to getting information and much more just by using your voice. Our scientists and engineers develop foundational AI technologies for anyone to build intelligent conversational interfaces for any device, application, language or environment. We build machine learning algorithms, services and data-driven models for all the key components that contribute to the magic that is Alexa. We invite you to learn more at Alexa Science.

Appen G4

Appen collects and labels images, text, speech, audio, and video used to build and continuously improve the world’s most innovative artificial intelligence systems.

With expertise in more than 180 languages, a global crowd of over 1 million skilled contractors, and the industry’s most advanced AI-assisted data annotation platform, Appen solutions provide the quality, security, and speed required by leaders in technology, automotive, financial services, retail, manufacturing, and governments worldwide.

Founded in 1996, Appen has customers and offices around the world.

Apple G7

We’re a diverse collection of people, reimagining what’s possible to help us do what we love in new ways. The people who work here have reinvented entire industries with the Mac, iPhone, iPad, and Apple Watch, and with services, including iTunes, the App Store, Apple Music, and Apple Pay.

Employees in machine learning and AI are building amazing experiences into every Apple product, allowing millions to do what they never imagined. Because Apple fully integrates hardware and software across every device, these researchers and engineers collaborate more effectively to improve the user experience while protecting user data. Come make an impact with the products you create and the research you publish.



ASAPP builds machine-learning products for enterprise-level problems, where both cost-efficiency and productivity are at stake. Our approach unites human work with AI Native/ML products that have been meticulously developed by the best minds in their field. We make interactions easier, more efficient, and more enjoyable for employees and customers alike. 

The Speech Systems team at ASAPP provides world-class speech services to enable ASAPP’s AI-Native products to understand conversations between humans, and between humans and machines. We do this by pushing the limits of state-of-the-art applied research on speech, through joint optimization with our proprietary NLP technology, innovative engineering and highly collaborative teamwork.

Auphonic F5

Auphonic develops intelligent audio algorithms using a combination of machine learning, music information retrieval and signal processing to create automatic audio post production software for podcasts, broadcasters, movies, lecture recordings, audio books and more.

Our users just upload their recorded audio and Auphonic will do the rest: neither complicated parameter settings nor audio expert knowledge is necessary and our algorithms keep learning and adapting to new data every day!

Auphonic overview: https://auphonic.com
Audio examples of our algorithms: https://auphonic.com/audio_examples


AVL is the world's largest independent company for the development, simulation and testing of powertrain systems for passenger cars, commercial vehicles, construction, large engines and their integration into the vehicle. As a global technology leader, AVL provides complete and integrated development environments, measurement and test systems as well as state-of-the-art simulation methods. As a pioneer in the field of innovative solutions, such as diverse electrification strategies for powertrains, AVL is increasingly taking on new tasks in the field of autonomous driving, especially on the basis of subjective human sensations (driveability, connectivity, ADAS, ...). Across the competing technologies – internal combustion engine, battery/electric drive and fuel cell – and their combinations, AVL works intensively and gives each equal priority. AVL has digitized the vehicle development process with state-of-the-art and highly scalable IT, software and technology platforms, and creates new customer solutions in the areas of big data, artificial intelligence, simulation and embedded systems in an agile and integrated development environment.

Beijing Huiting Technology Co., Ltd F2

Beijing Huiting Technology Corporation is a professional multimedia data service provider. Our corpus production team has more than ten years of experience in providing high-quality corpora for speech recognition, speech synthesis, speech quality evaluation, and multimedia research. The team is supported by an outstanding research and development team composed of researchers and engineers from the Chinese Academy of Sciences research institutes and top universities. Our corpora are recorded in our ITU-standardized recording chamber with professional equipment. During corpus production, we supervise the whole workflow to ensure a flawless labeling process for our high-quality data services.

Carstens Medizinelektronik GmbH F9

1250 Hz data sampling rate - 250 Hz / 1250 Hz position result
High temporal and spatial resolution, therefore very suitable for recording small or fast speech movements
Dynamic positional accuracy 0.3 mm RMS, valid for 100 % of all data
Integrated signals for synchronisation of sound or other devices
Real-time display during recording
Up to 24 channels available
All-in-one-system with a stand, calibration system, computer, various software, acoustic equipment
Individual service and a strong community
Two AG501 units can be operated synchronously as a dual system
Multi usable sensors - flexible shielded cable - easy to plug in
Reliable speech research for scientists in phonetics, forensics, neurology, neurophysiology and dentistry.
Also suited to non-articulatory research, such as the grasping movements found in sign language.
Suitable to measure dimensions of structures like palate or lower jaw.

Databaker Technology G6

Founded in February 2016, Databaker has a professional team that supplies data annotation, data collection services and TTS solutions for AI.

With the advantage of data processing technology, we are committed to providing customers with fast and accurate professional data services to maximize data value and promote innovation in technology, application and industry.

Datatang Technology INC. G17

Datatang is a trusted data pre-processing company. We collect, annotate, and customize data to meet our clients’ various needs. We help clients in university research labs and company R&D departments offload the tedious yet necessary processing of images, voice, and text, so that they can reach the highest-value data more efficiently.

We maintain long-term cooperation with our clients and view their success as ours. We believe that, by collaborating, both our clients and Datatang are better off than either would be working alone. Our clients include Fortune 500 companies, but we also work with mid-sized corporations, universities, non-profit organizations, and government agencies.

DefinedCrowd G13

DefinedCrowd offers an intelligent data infrastructure for AI that provides high-quality training data to help machine-learning-oriented products reach the market quicker and with better quality. With strong expertise in speech and natural language processing technologies, we support a broad range of use cases, such as virtual assistants, customer review and satisfaction, autonomous vehicles, content categorization, pattern recognition or even surveillance systems. Our platform offers efficient data workflows that enable data scientists to collect, synthesize, enrich and structure training data by combining human intelligence, automatic tools, and machine learning capabilities to accelerate enterprise AI initiatives. We believe the full potential of an AI model is framed by the quality of the training data it is fed. DefinedCrowd empowers data scientists globally to compete and differentiate by providing an on-demand solution for high-quality training data at their fingertips, with quality (95-98%), speed (5-10x faster than the competition) and scale (50+ languages covered).

Dialpad Inc. F11

At Dialpad, we use AI to help businesses make smarter calls. Our Voice Intelligence team builds custom speech recognition and natural language processing models to generate insights and real-time recommendations to improve conversations.

DiDi F6

Didi Chuxing (“DiDi”) is the world’s leading mobile transportation platform. The company offers a full range of app-based transportation options for 550 million users, including Taxi, Express, Premier, Luxe, Bus, Designated Driving, Enterprise Solutions, Bike Sharing, E-bike Sharing, Car Rental and Sharing and food delivery. Tens of millions of drivers who find flexible work opportunities on the DiDi platform provide 10 billion passenger trips a year.

DiDi is committed to collaborating with policymakers, the taxi industry and communities to solve the world’s transportation, environmental and employment challenges with smart transportation innovations.

DiDi partners with Grab, Lyft, Ola, 99 and Bolt (Taxify) in a global ride-hailing network that reaches over 80% of the world’s population across over 1,000 cities. Currently, DiDi provides ride-hailing services in Brazil under the 99 brand, operates DiDi-branded mobility services in Mexico, Chile, Colombia and Australia, and provides taxi-hailing service in Japan through a joint venture. By continuously improving user experience and creating social value, DiDi strives to build a safe, open and sustainable mobile transportation ecosystem.


Founded in 1995, ELRA, the European Language Resources Association, is a non-profit organization whose main mission is to make Language Resources (LRs) for Human Language Technologies (HLT) available to the community at large. To achieve this goal, ELRA carries out a wide variety of activities around LRs. ELRA acts as a data center for distributing LRs for free or for a fee, depending on the requirements of data right holders. ELRA also acts as a production center for all types of resources and languages, from production of raw data to data annotation, transcription, validation, etc. ELRA offers a helpdesk covering all issues related to LR use, in particular for clearing all IPRs (Intellectual Property Rights) and drafting/customizing licenses.
ELRA is also the organizer of the LREC series of conferences, the major event on LRs and the evaluation of technologies; LREC 2020 will be held in Marseille, France, in May 2020.

Facebook Artificial Intelligence F7

Bringing the world closer together by advancing artificial intelligence

At Facebook AI, we're connecting people to what they care about, powering new, meaningful experiences, and advancing the state-of-the-art through open research and accessible tooling.

Our teams accelerate research breakthroughs across both existing and new learning paradigms to develop state-of-the-art AI that has a positive impact on people and society.

Fano Labs Limited F4

Headquartered in Hong Kong, Fano Labs is one of the best AI companies in Greater China, with specialists focusing on AI technologies including Automatic Speech Recognition (ASR), Natural Language Processing (NLP) and Big Data technologies to help enterprises with customer service, compliance and other lines of business.
We have an in-house research team composed of professors and PhDs from prestigious universities, such as HKU, MIT, UCB, and NUS, who are at the forefront of advanced speech and NLP research. Our team consists of experienced veterans in artificial intelligence, making it possible to transform the coolest technologies into great products and solutions.
Combining Fano Labs’ technical focus on a variety of languages, dialects, minor languages, and mixed languages with professional services and domain-specific knowledge, we provide AI technologies and solutions to clients from various industries, including Banking, Telecom, Government and more.

Furhat Robotics G22

Furhat is the world's most advanced social robot that can interact with humans in the same way we interact with each other, by speaking, listening, showing emotions and maintaining eye contact. The minimalistic form factor belies a machine with extraordinary capabilities that can solve problems beyond the capacity of any other modern day computer interface. Furhat’s back-projection system makes it the most customizable robot in the world, allowing it to be used in a wide variety of research applications.

Gnani.ai F14

Gnani.ai is an AI-based Speech Recognition and Natural Language Processing company backed by Samsung Ventures Investment Corporation. Gnani.ai is focused on technologies like Speech Recognition, NLP and text-to-speech and provides solutions around these core technologies. With support for multiple languages, Gnani.ai also has dedicated libraries for industry-specific use cases. Gnani.ai also provides customised annotation services on its scalable annotation platform. From an applications perspective, Gnani.ai provides solutions like domain-specific voice assistants and omnichannel analytics on multiple channels like mobile, web and telephone.

Google G8

Our mission is to organize the world’s information and make it universally accessible and useful, and AI is enabling us to do that in incredible new ways - solving problems for our users, our customers, and the world. AI makes it easier for you to do things every day, whether it’s searching for photos of people you love, breaking down language barriers, or helping you get things done with your own personal digital assistant. But it’s also providing us with new ways of looking at old problems and helping transform how we work and live, and we think the biggest impact will come when everyone can access it.


The 21st INTERSPEECH will be held at the Shanghai International Convention Center on September 14-18, 2020.
Shanghai is situated in the Yangtze River Delta in Eastern China. A series of international academic conferences, such as PIC2016 and ICASSP2016, have been held successfully here. Equipped with all the necessary resources, and as one of the most important cultural and commercial centers of China, Shanghai will offer a great deal of history and culture to the delegates.
The theme of INTERSPEECH 2020 is Cognitive Intelligence for Speech Processing. A series of sessions, such as plenary talks, tutorials, special events and exhibits, will be organized around this topic. We will invite speech and language scientists, researchers and engineers from all over the world to explore, investigate and discuss this important issue in both theory and practice. We sincerely hope Interspeech 2020 will be a highly positive scientific, social and aesthetic experience for all the participants.


INTERSPEECH 2021 will be held in Brno, Czechia, from August 30 to September 9 2021.


ISCA is a non-profit organization registered in France.
The purpose of the association is to promote, in an international world-wide context, activities and exchanges in all fields related to speech communication science and technology. The association is aimed at all persons and institutions interested in fundamental research and technological development that aims at describing, explaining and reproducing the various aspects of human communication by speech, that is, without assuming this enumeration to be exhaustive, phonetics, linguistics, computer speech recognition and synthesis, speech compression, speaker recognition, aids to medical diagnosis of voice pathologies.

Linguistic Data Consortium F16

The Linguistic Data Consortium is a non-profit organization hosted by the University of Pennsylvania whose mission is to support language-related education, research and technology development by creating and sharing linguistic resources, such as data, tools and standards. After more than 25 years as the leader in language resource development and distribution, LDC continues to provide to the community large quantities of diverse data, research support and high quality membership services. Human language technology development, artificial intelligence and related fields are changing rapidly and need effective digital resource delivery, greater language coverage, new data genres, fast, cost-efficient annotation processes and flexible tools. The Consortium successfully meets those challenges and will continue to do so with the support of members, licensees, sponsors and collaborators.

Magic Data Technology Co., Ltd F15

Magic Data Technology Co., Ltd. is one of the leading artificial intelligence data service providers in the world.
We are committed to providing a wide range of customized data services in the fields of speech recognition, intelligent imaging and NLU.
Through rigorous quality control measures, we strive to offer top-notch, individualized training data sets to meet our clients’ distinctive needs. Magic Data provides both pre-annotated training datasets available for immediate purchase and customized data solutions, including human-in-the-loop exception handling. Magic Data employs a vast team of skilled data specialists and has a wide network of consultants around the world to assist with specific data needs.

MathWorks F13

MATLAB and Simulink are fundamental computation tools used at more than 5,000 educational institutions worldwide. MATLAB is one of the top 10 most popular programming languages and is used for teaching, research, and project-based learning. Add MATLAB and Simulink to the classroom to inspire critical thinking and innovation as well as prepare students for prominent careers in industry, where the tools are the de facto standard for R&D. Learn more at mathworks.com

Microsoft Corporation G16

Microsoft Research is where leading scientists and engineers have the freedom and support to tackle complex problems that propel innovation and improve lives. aka.ms/interspeech-2019


NAVER has been the most dominant web portal over the last two decades, powered by its search engine, webtoons, shopping, video and other Internet-related services and products.
Based in Japan, LINE Corporation is a global smart portal with over 200 million monthly users in more than 230 regions.

Northern Digital Inc. (NDI) G12

NDI is a leading global innovator and manufacturer of 3D measurement and motion tracking solutions. Our Vox-EMA (electromagnetic articulography) tracking system captures dynamic orofacial and articulatory movements that can be used for the research and rehabilitation of speech pathologies. Micro sensors attached to the tongue, lips, jaw, and face record 5/6DOF movements without occlusions. Up to 16 5DOF sensors can be used at once to produce a detailed model of the movement that occurred. Minimal latency and noise, combined with high temporal and spatial accuracy, ensure the fastest and most subtle of movements are captured the instant they occur. With its intelligent hardware design, seamless audio synchronization, and intuitive software, the Vox-EMA puts the focus on fast, accurate data collection, not lengthy setup or post-processing.

Nuance Communications F8

It’s hard to get through a day without experiencing Nuance. We lead innovation in intuitive, award-winning conversational AI that adapts to each business and every unique situation. Our solutions don’t just hear and speak. They understand, analyze, anticipate, reason and resolve. We don’t just make artificial intelligence: we make sure it works the way you want it to. Nuance’s R&D team of talented scientists, linguists and engineers pioneers innovation in human-machine intelligence and communication, resulting in groundbreaking strides in voice, language understanding and AI that redefine how people and technology co-exist.


Phonexia is the only speech technology software manufacturer that reveals and leverages the most data in speech for enterprising trailblazers across the globe who want to discover and develop powerful new skills in a knowledge-based economy.

Raytheon BBN Technologies G14

Raytheon BBN Technologies' Analytics and Machine Intelligence group is a world leader in machine learning and artificial intelligence research and advanced solutions. With a 40-year history of solving difficult problems for demanding customers, we have advanced the state of the art in Speech Recognition, Machine Translation, and Information Extraction. Our success comes from YOU: please stop by our booth to see what we are doing and find out what positions are available!

Speechocean F18

Since its establishment as an AI data resource provider, Speechocean has always devoted itself to providing specialized engineering data products and services to enterprises and scientific research institutions in the whole industry chain of AI. Our business involves various domains such as speech recognition, speech synthesis, computer vision, lexicon, and natural language processing and provides relevant services for the design, collection, transcription, annotation, etc. of data.

The products and services of Speechocean have been applied to various AI domains, such as smart home, smart speaker, autopilot, and vehicle navigation. Our customers cover major large technology companies, AI enterprises and scientific research institutes in the world.

Speechocean has global business support and delivery capabilities, and its product line includes 130+ languages and dialects in 70+ countries and regions.

Surfing Technology Beijing Co. Ltd G9

Surfing Technology Beijing Co., Ltd is a data provider that supplies a DaaS (Data as a Service) platform for AI companies.

With a deep understanding of the data-related pain points in the development of AI algorithms, we independently design and process data in scientific ways, producing high-value speech recognition and face recognition datasets optimized for training AI speech and facial algorithms.

Our data solutions have been verified by top companies worldwide. Furthermore, we are always optimizing and upgrading our data solutions in order to provide the best value and services to our clients.

Zen3 Infosolutions America Inc G21

Zen3 is an AI-first data services organization. Our work with global technology leaders helps deliver services used by over a billion people. Our Data Services Group (DSG) helps in the collection, annotation, translation and transcription, labelling and analysis of speech data for some of the most complex machine learning programs, including 2 of the top 4 speech assistants used globally.

With 1,300+ data experts, linguists, language experts, annotators and technology experts, we have delivered over 150 million data tasks and over 10K hours of speech data. We are 100% PII compliant and have data quality acceptance and accuracy levels of over 90%.

Our services are powered by DataMime, a custom end-to-end speech collection, labeling and annotation tool with a framework to ensure high quality and people management processes.

The Zen3 DSG works to deliver other media types – image, video, text and maps – apart from speech.