Semantic Frames and Word Embeddings at Google - Go Fish Digital
Word embeddings are a way for Google to look at text, whether a short tweet, a query, a page, or a site, and better understand the words in it. They let Google recognize when a word or a sentence could be added, which is how query rewriting under something like RankBrain takes place. But the word embedding approach doesn't understand the context of words, like the difference between a river bank and withdrawing money from a bank account. So Google has been exploring ways to pre-train text, so that not only can this natural language processing approach understand what might be missing, but possibly so that the contexts and meanings of words can be better understood.

When I worked for the largest trial court in the State of Delaware, there were terms we used that everyone working in the court knew the meaning of, but that most people wouldn't see in normal conversations, such as capias (a bench warrant issued by a judge) or nolle pros'd (a notice of nolle prosequi filed by a deputy attorney general stating that they have decided not to prosecute a charge that had been indicted or brought on a warrant by a police officer). These words can mean that someone may end up being locked up, or released from jail or prison, and they are part of the everyday framework of language for people who work in a court system. When those words are explained in the context of a frame, such as the criminal justice system, they gain a lot of meaning.

Frame semantics has been part of computational linguistics for over 20 years. It appears to be something Google will be working into some of the more recent technology they have been coming up with, like the word vectors, or word embeddings, behind technology such as their RankBrain update. Before I talk about a Google patent that introduces this, I think it's important to look more closely at what frame semantics is and how it works.

The criminal justice system I start this post off with is a conceptual frame that gives words such as capias and nolle pros'd meaning. Without having been in that world, I wouldn't understand them. I also wouldn't know the difference between prison and jail: a jail is a place where someone is held before they have been tried and convicted of a criminal offense, and a prison is where they are sent after a trial and sentencing. When someone uses either word and means the other, I know that they haven't worked in the criminal justice system; that frame is outside of their experience.

In addition to looking at the FrameNet project pages, it is worth noting that it is rare to see a Google patent in which the inventors behind the patent have also written a whitepaper on the same topic. I've seen this done with many patents from Microsoft, but only a handful from Google. In this case, there is one that is worth spending some time with: the paper is Semantic Frame Identification with Distributed Word Representations.
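To make the limitation mentioned above concrete, here is a minimal sketch of how word embeddings behave, using tiny made-up vectors (a real system would learn embeddings with hundreds of dimensions from large amounts of text, not hand-write them). Because each word gets a single fixed vector, the word "bank" looks the same whether a sentence is about a river or about a bank account:

```python
import numpy as np

# Toy, hand-made embeddings purely for illustration; real embeddings
# (word2vec, GloVe, etc.) are learned from large corpora and have
# hundreds of dimensions rather than four.
embeddings = {
    "river":   np.array([0.9, 0.1, 0.0, 0.2]),
    "water":   np.array([0.8, 0.2, 0.1, 0.1]),
    "money":   np.array([0.0, 0.9, 0.8, 0.1]),
    "account": np.array([0.1, 0.8, 0.9, 0.0]),
    "bank":    np.array([0.5, 0.5, 0.4, 0.1]),  # one vector covers every sense
}

def cosine(a, b):
    """Cosine similarity: how closely two word vectors point in the same direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "bank" is about equally close to both senses no matter what sentence it
# appears in, because the embedding is fixed per word, not per use of the word.
print(cosine(embeddings["bank"], embeddings["river"]))   # river-bank sense
print(cosine(embeddings["bank"], embeddings["money"]))   # bank-account sense
```

This is exactly the gap that frame semantics and pre-training on context are meant to help close.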
I explained how working at the Delaware courts gave me an understanding of words that were commonly used at a courthouse, words that often meant the difference between people being locked up or released from prison, but which most people wouldn't understand. In the video on cognitive linguistics we were told about the frame of someone waiting on customers in a restaurant, and how words come from that frame. The frame of commercial buying was also mentioned, and it is illustrated in a screenshot from the Google patent, which shows how such language might be annotated under that frame.

Many patents are filled with definitions, and this new one from Google is no different. While I have provided some examples and a definition of what frame semantics is, and a couple of videos about it, looking at Google's definition from the patent is worth doing, because it provides context for how frame semantics may be used in the process the patent describes. Here is how they define frame semantics:

Linguistic semantics focuses on the history of how words have been used in the past. Frame semantics is a theory of language meaning that relates linguistic utterances to word knowledge, such as event types and their participants. A semantic frame refers to a collection of facts or a coherent structure of related concepts that specify features (attributes, functions, interactions, etc.) that are typically associated with a specific word. One example semantic frame is the situation of a commercial transfer or transaction, which can involve a seller, a buyer, goods, and other related things.
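One rough way to picture that definition is as a small data structure: the frame names a situation, its frame elements name the typical participants, and its lexical units are the words that can evoke it. The sketch below is loosely modeled on the commercial transaction example from the patent; the class, field, and example names are my own illustration, not anything taken from the patent or from FrameNet:

```python
from dataclasses import dataclass

@dataclass
class SemanticFrame:
    """A coherent structure of related concepts: the frame names the situation,
    its elements name the typical participants, and its lexical units are the
    predicate words that can evoke the frame."""
    name: str
    elements: list[str]        # participants / attributes of the situation
    lexical_units: set[str]    # words that can evoke this frame

# The commercial-transaction example: a buyer, a seller, goods, and money,
# evoked by words like "buy" or "purchase".
commerce_buy = SemanticFrame(
    name="Commerce_buy",
    elements=["Buyer", "Seller", "Goods", "Money"],
    lexical_units={"buy", "bought", "purchase", "purchased"},
)

# Annotating a sentence under the frame means labeling which spans of the
# sentence fill which frame elements, roughly what the patent's screenshot shows.
annotation = {
    "frame": commerce_buy.name,
    "target": "bought",
    "Buyer": "Chuck",
    "Goods": "a car",
    "Seller": "Jerry",
}
```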
The patent's summary describes the technique this way:

A computer-implemented technique is presented. The technique can include receiving, at a server having one or more processors, labeled training data including a plurality of groups of words, each group of words having a predicate word and each word having generic word embeddings. The technique can include extracting, at the server, the plurality of groups of words in a syntactic context of their predicate words. The technique can include concatenating, at the server, the generic word embeddings to create a high dimensional vector space representing features for each word. The technique can include obtaining, at the server, a model having a learned mapping from the high dimensional vector space to a low dimensional vector space and learned embeddings for each possible semantic frame in the low dimensional vector space. The technique can also include outputting, by the server, the model for storage, the model being configured to identify a specific semantic frame for an input.

Question answering: In other embodiments, the technique further includes receiving, at the server, speech input representing a question; converting, at the server, the speech input to a text; analyzing, at the server, the text using the model; and generating and outputting, by the server, an answer to the question based on the analyzing of the text using the model.

Translation: In some embodiments, the technique further includes receiving, at the server, a text to be translated from a source language to a target language, the source language being the same language as a language associated with the model; analyzing, at the server, the text using the model; and generating and outputting, by the server, a translation of the text from the source language to the target language based on the analyzing of the text using the model.

Search results: A computer-implemented technique can include receiving, at a server, labeled training data including a plurality of groups of words, each group of words having a predicate word and each word having generic word embeddings. The technique can include extracting, at the server, the plurality of groups of words in a syntactic context of their predicate words, concatenating the generic word embeddings to create a high dimensional vector space representing features for each word, and obtaining a model having a learned mapping from the high dimensional vector space to a low dimensional vector space and learned embeddings for each possible semantic frame in the low dimensional vector space. The technique can also include outputting, by the server, the model for storage, the model being configured to identify a specific semantic frame for an input.
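Reading that summary as a recipe, a frame identification model of this kind might look something like the following sketch: concatenate the generic embeddings of a predicate word and the words in its syntactic context into one high dimensional vector, map it into a low dimensional space that also holds an embedding for every possible frame, and label new input with the highest scoring frame. The dimensions, names, plain linear mapping, and dot-product scoring below are all assumptions made for illustration, not details taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM = 50        # size of each generic word embedding (assumed)
CONTEXT_SLOTS = 4   # predicate plus a few syntactic context positions (assumed)
HIGH_DIM = EMB_DIM * CONTEXT_SLOTS   # concatenated "high dimensional" input
LOW_DIM = 16        # shared low dimensional space for inputs and frames
FRAMES = ["Commerce_buy", "Arrest", "Being_located"]

# Generic, pre-trained word embeddings (random stand-ins here).
word_emb = {w: rng.normal(size=EMB_DIM)
            for w in ["chuck", "bought", "a", "car", "<empty>"]}

# Learned parameters: a mapping from the high dimensional input space to the
# low dimensional space, plus an embedding for each possible semantic frame.
# In training these would be fit to the labeled data; here they are random.
W = rng.normal(size=(LOW_DIM, HIGH_DIM))
frame_emb = {f: rng.normal(size=LOW_DIM) for f in FRAMES}

def identify_frame(predicate, context_words):
    """Concatenate embeddings for the predicate and its syntactic context,
    project into the low dimensional space, and return the frame whose
    embedding scores highest against the projected input."""
    slots = [predicate] + context_words
    slots += ["<empty>"] * (CONTEXT_SLOTS - len(slots))   # pad missing slots
    x = np.concatenate([word_emb[w] for w in slots[:CONTEXT_SLOTS]])
    z = W @ x
    return max(FRAMES, key=lambda f: float(frame_emb[f] @ z))

print(identify_frame("bought", ["chuck", "car"]))
```

In training, the mapping and the frame embeddings would be adjusted so that the correct frame scores highest for each labeled example, which is what the patent means by a learned mapping and learned frame embeddings.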
Is Google in a post semantic frames time?

One recent paper describes a crowdsourced resource for FrameNet frame disambiguation:

We present a resource for the task of FrameNet semantic frame disambiguation of over 5,000 word-sentence pairs from the Wikipedia corpus. The annotations were collected using a novel crowdsourcing approach with multiple workers per sentence to capture inter-annotator disagreement. In contrast to the typical approach of attributing the best single frame to each word, we provide a list of frames with disagreement-based scores that express the confidence with which each frame applies to the word. This is based on the idea that inter-annotator disagreement is at least partly caused by the ambiguity that is inherent to the text and frames. We have found many examples where the semantics of individual frames overlap sufficiently to make them acceptable alternatives for interpreting a sentence. We have argued that ignoring this ambiguity creates an overly arbitrary target for training and evaluating natural language processing systems: if humans cannot agree, why would we expect the correct answer from a machine to be any different? To process this data we also utilized an expanded lemma-set provided by the Framester system, which merges FrameNet with WordNet to enhance coverage. Our dataset includes annotations of 1,000 sentence-word pairs whose lemmas are not part of FrameNet. Finally, we present metrics for evaluating frame disambiguation systems that account for ambiguity.

Another recent paper looks at where linguistic information is captured within BERT:

Pre-trained text encoders have rapidly advanced the state of the art on many NLP tasks. We focus on one such model, BERT, and aim to quantify where linguistic information is captured within the network. We find that the model represents the steps of the traditional NLP pipeline in an interpretable and localizable way, and that the regions responsible for each step appear in the expected sequence: POS tagging, parsing, NER, semantic roles, then coreference. Qualitative analysis reveals that the model can and often does adjust this pipeline dynamically, revising lower-level decisions on the basis of disambiguating information from higher-level representations.

I've summarized the summary of this patent, but looking at it, and at what has come after it, it might be worth skipping ahead in time to see some of the other things that Google is working on. The detailed description of this patent provides more details about how it works; however, one of the inventors of this semantic frames patent, and an author of the related whitepaper (Dipanjan Das), is also an author of a more recent paper at Google around BERT, which appears to be creating a buzz around the search industry (the classical NLP pipeline paper I linked to above). The semantic frames patent is an updated continuation of a patent that was originally filed on May 7, 2014. Knowing about semantic frames and how they could potentially be used is helpful, especially for understanding how they aim at giving context to words being processed.