MMML-Tutorial-Alignment

June 28, 2022 | 0:23:24

Created: June 28, 2022




	00:00	(Beginning of video)

	00:00	Another key technical challenge in multimodal learning is alignment.

	00:06	Alignment comes in many different flavors and many different applications, and almost all problems will have an aspect of alignment. For example, in media description, like image captioning, you want to know which object relates to which words, or, in a video, which elements or which events relate to which phrase,

	00:31	as part of a video caption. Modality transcription, like text-to-speech or even generating gestures from speech, will also require this alignment, because the gestures are not already aligned.

	00:48	And in many of these new applications, like navigation or question answering, you may be asked something like "Where's the phone?", so you need to find the phone, but you also need some context related to that.

	23:24	(End of video)