MAJORDOME: Personal Assistant and Unified Messaging
G. Chollet, L. Likforman, K. Hallouli, N. Azzabou, S.S. Lin, S. Renouard, M. Sigelle, F. Yvon
Multimedia Day - GET Scientific Council - 9/10/2003
The MAJORDOME is an intelligent personal assistant that:
1. Keeps a memory for you
2. Communicates with your contacts
3. Answers your questions
The MAJORDOME can be centralized (enterprise server), mobile (on a PDA or laptop PC), or distributed.
Majordome is a distributed Personal Digital Assistant
It is your digital slave. It is personal. It remembers everything you have told it.
It uses resources from your mobile (wireless) device, from your home, from your office, from the Internet, from the environment, …
You interact with it using voice, pen, graphics, …
Interactions with your Majordome
Majordome recognizes your identity, your voice, your handwriting, ...
Its speech recognizer is adapted to your voice,
Its handwriting recognizer is adapted to your writing style,
It can speak to you,
It can display information for you,
It can talk with other people, either locally or over the phone.
What does Majordome do for you?
Answers your phone,
Receives and interprets your faxes, your emails, …
Supplements your memory (address book, agenda, bookmarks, alarm clock, health record, bank account, documentation, …),
Serves as an interface between you and the (digital) world,
Searches the web, internet forums, …
Controls your home, your car, your children, your parents, …
The MAJORDOME:
- Answers the phone,
- Receives your faxes,
- Records and interprets your messages,
- Accesses your email,
- Verifies your identity
Majordome’s Functionalities
• Speaker verification
• Dialogue
• Routing
• Updating the agenda
• Automatic summary
[Figure labels: Voice, Fax]
Overview of Majordome
Background tasks (server-side only):
– sorting and filtering messages from different sources (E-mail, voice, fax, SMS, …);
– extracting relevant information for reporting to the user (names of senders, subject, …).
Dialogue with the user, over the phone or the Web:
– The system presents the state of the mailbox, the type of messages, their sender and subject, and may summarize them or read them on request;
– The users access their mailbox, address book, schedule, or Web addresses.
Fax processing
Text message processing
Content Extraction in Majordome
• Overall objective: quick detection of short information elements for message filtering and reporting to the user
• Functional position of this processing phase:
– Server-side, event-oriented, background task
– Subsequent and/or parallel to speech recognition (voice messages) or image processing (faxes); prior to text summarization
Useful applications (1)
• Name/Date/Subject identification (especially useful for fax and voice messages, which have no standardized fields for storing this information)
– “You have 1 fax message from Mrs Diaconu about ‘attending the Barcelona meeting’…”
• Backup information: user’s addressbook (PABX info yields sender’s phone number)
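The address-book fallback on this slide can be sketched as follows. This is only an illustration, assuming the PABX delivers the caller's number as a string; `ADDRESS_BOOK`, its single entry, the phone number, and both function names are hypothetical and not taken from Majordome.

```python
from typing import Optional

# Hypothetical address book: normalized phone number -> display name.
ADDRESS_BOOK = {"+33100000000": "Mrs Diaconu"}


def normalize(number: str) -> str:
    """Keep only digits (and a leading '+') so formatting differences do not matter."""
    digits = "".join(ch for ch in number if ch.isdigit())
    return "+" + digits if number.strip().startswith("+") else digits


def report_fax(extracted_name: Optional[str], caller_number: str, subject: str) -> str:
    """Build the report sentence; fall back on the address book if no name was extracted."""
    name = extracted_name or ADDRESS_BOOK.get(normalize(caller_number), "an unknown sender")
    return f"You have 1 fax message from {name} about '{subject}'."


print(report_fax(None, "+33 1 00 00 00 00", "attending the Barcelona meeting"))
```

The name extracted from the fax image takes priority; the phone number is consulted only when extraction yields nothing.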
Useful applications (2)
• Message filtering:
– “You have received 14 personal E-mail messages, among which 3 messages from friends, 6 requests from students or colleagues, and 5 spam messages; you have received 26 mailing list messages, among which 3 calls for papers, 11 conference announcements, and 12 others.”
• Backup information: RFC-822 “From” and “Subject” fields.
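For E-mail, the RFC-822 fallback and the spoken tally could look roughly like this. It is a sketch using Python's standard `email` package; the sample message, the category labels, and the function names are assumptions, and the classifier that would produce the labels is out of scope here.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample message, used only to exercise the functions below.
RAW = """From: Alice Martin <alice@example.org>
Subject: Call for papers
To: user@example.org

Body text.
"""


def sender_and_subject(raw_message: str) -> tuple[str, str]:
    """Read the RFC-822 "From" and "Subject" header fields of one message."""
    msg = message_from_string(raw_message)
    name, address = parseaddr(msg.get("From", ""))
    return (name or address, msg.get("Subject", ""))


def summarize(labels: list[str]) -> str:
    """Turn per-message category labels into a one-sentence mailbox report."""
    counts = Counter(labels)
    parts = [f"{n} {label}" for label, n in counts.most_common()]
    return f"You have received {len(labels)} messages: " + ", ".join(parts) + "."


print(sender_and_subject(RAW))
print(summarize(["spam", "friend", "spam"]))
```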
Techniques (1)
• Text statistics measures:
– Frequency of occurrence of certain words/morphological categories/syntactic structures in different types of messages
E.g. the noun/verb frequency ratio is higher in technical texts; some style markers are specific to certain text genres (e.g. frequent use of ‘!’ or ‘$’ in advertisements; ‘loose style’ abbreviations like ‘CU’, ‘IMHO’ in English, or ‘A+’ in French)
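The style-marker part of these measures can be illustrated with a short sketch. The marker sets and the feature names are assumptions chosen from the examples on the slide; the noun/verb ratio is left out because it would need a part-of-speech tagger.

```python
import re

# Hypothetical marker sets, taken from the examples named on the slide.
ADVERT_MARKERS = {"!", "$"}
LOOSE_ABBREVIATIONS = {"cu", "imho", "a+"}


def style_markers(text: str) -> dict[str, float]:
    """Return the share of tokens that are advert markers or 'loose style' abbreviations."""
    # Tokens are runs of letters (optionally followed by '+', for 'A+') or a single '!' / '$'.
    tokens = re.findall(r"[^\W\d_]+\+?|[!$]", text)
    total = len(tokens) or 1  # avoid division by zero on empty input
    adverts = sum(1 for t in tokens if t in ADVERT_MARKERS)
    loose = sum(1 for t in tokens if t.lower() in LOOSE_ABBREVIATIONS)
    return {"advert_marker_rate": adverts / total, "loose_style_rate": loose / total}


print(style_markers("Win $ now ! ! CU"))
```

Rates like these could then serve as features for a message-genre classifier.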
Techniques (2)
• Text skimming:
– Spotting “good candidates” for specific word types (e.g. proper names): selecting capitalized words…
– … comparing with entries in common first names / family names database, and/or…
– … using local grammars to disambiguate other cases.
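The three skimming steps can be sketched together. The tiny lexicons and the two "local grammar" rules (title + word, first name + word) are assumptions for illustration, not the rules used in Majordome.

```python
import re

# Hypothetical mini-lexicons; a real system would load much larger name databases.
FIRST_NAMES = {"Maria", "Paul"}
TITLES = {"Mr", "Mrs", "Ms", "Dr"}


def proper_name_candidates(text: str) -> list[str]:
    """Select capitalized words, then keep those supported by the lexicon or a local rule."""
    tokens = re.findall(r"[^\W\d_]+", text)
    names = []
    for i, tok in enumerate(tokens):
        if not tok[0].isupper() or tok in TITLES:
            continue  # step 1: only capitalized, non-title words are candidates
        prev = tokens[i - 1] if i > 0 else ""
        in_lexicon = tok in FIRST_NAMES   # step 2: known first name
        after_title = prev in TITLES      # step 3a: title + capitalized word
        after_name = prev in FIRST_NAMES  # step 3b: first name + family name
        if in_lexicon or after_title or after_name:
            names.append(tok)
    return names


print(proper_name_candidates("Yesterday Mrs Diaconu wrote to Paul Durand about Barcelona."))
```

Sentence-initial words such as "Yesterday" and place names such as "Barcelona" are capitalized but rejected, because neither the lexicon nor a local rule supports them.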
Techniques (3)
• Merging visual clues and textual clues for mutual reinforcement of identification probability.
E.g. the probability that an unidentified, capitalized character string is the proper name of a fax’s sender increases if it stands alone on a line at the top of the image.
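The slide does not say how the clues are merged. One common option, shown here purely as an assumption, is to treat each clue as an independent likelihood ratio and multiply the prior odds by it. The function name and the numeric values are hypothetical.

```python
def combine_clues(prior: float, likelihood_ratios: list[float]) -> float:
    """Update a prior probability with independent clues expressed as likelihood ratios."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio  # a ratio above 1 supports the hypothesis, below 1 weakens it
    return odds / (1.0 + odds)


# Textual clue alone (capitalized, unknown string), then with the visual clue added
# (alone on a line at the top of the image). Both ratios are made-up values.
text_only = combine_clues(0.2, [4.0])
text_and_layout = combine_clues(0.2, [4.0, 3.0])
print(text_only, text_and_layout)
```

With these illustrative numbers, adding the layout clue raises the probability, which is the mutual reinforcement the slide describes.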
Content Extraction: Current Developments
• Toolbox for text statistics (word frequency, contextual windows, co-occurrence frequency…)
• Tool for determining fuzzy membership in a given class of words
• Tool for determining document language and segmenting multilingual documents
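As an illustration of the text-statistics toolbox, word frequency and co-occurrence within a contextual window can be computed in a few lines. The function name, the window definition (following tokens only), and the sample sentence are assumptions.

```python
from collections import Counter


def cooccurrences(tokens: list[str], window: int = 2) -> Counter:
    """Count unordered pairs of distinct words at most `window` tokens apart."""
    pairs = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                pairs[tuple(sorted((left, right)))] += 1
    return pairs


tokens = "call for papers call for demos".split()
word_frequency = Counter(tokens)
pair_frequency = cooccurrences(tokens, window=1)
print(word_frequency.most_common(2))
print(pair_frequency.most_common(1))
```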
Content Extraction: Future Developments
• Text categorization module for message sorting and filtering
• Text genre database with (user-controlled) learning capabilities
Name Block Location in Facsimile Images
[Figure: processing diagram. Labels: image; pseudo-words extraction; H/P discrimination; header candidates selection; OCR'd version; logical pair extraction; sender name extraction; lexicon & logical classes; spatial cues; sender name]
Sender and Recipient Image Regions
[Figure: two fax header layouts with “from” and “to” fields, each marked with regions labelled LP, SR and RR]
Database of Facsimile Images
Campaign for receiving fax images -> 30 faxes
Existing database -> 40 faxes
Paper database -> 40 faxes
We also ask partners to collect faxes (> 10 each)
HTML file construction (handprinted names)
<html><br> You have received a new fax from
<br> Sender name : <IMG SRC="recu006.gif"></html>
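A notification page like the one on this slide could be generated along these lines. The function name is hypothetical, and the sketch assumes the sender's name is delivered as a cropped image file (here `recu006.gif`, the file name from the slide) rather than as recognized text.

```python
from html import escape


def fax_notification_html(sender_image: str) -> str:
    """Build the notification page, embedding the image of the handprinted sender name."""
    src = escape(sender_image, quote=True)  # keep odd file names from breaking the attribute
    return (
        "<html><br> You have received a new fax from\n"
        f'<br> Sender name : <IMG SRC="{src}"></html>'
    )


print(fax_notification_html("recu006.gif"))
```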
Original Image of Fax
Logical Pair (LP) and Sender Name (SN) location
[Figure: fax image with the LP and SN regions marked]
Telephone interface
1. Caller recognition
– Proper names
– Spelled names
2. Speaker verification
3. Voice navigation in the mailbox
4. Speech synthesis
Voice technology in Majordome
Server-side background tasks:
– Continuous speech recognition applied to voice messages upon reception
– Detection of sender’s name and subject
User interaction:
– Identification of the speaker (and verification if necessary)
– Speech recognition (receiving users’ commands through voice interaction)
– Text-to-speech synthesis (reading text summaries, E-mails or faxes)
SMS / MMS interface
PDA interface
Demonstration
Outlook