
DATA641 Midterm

Q1 + Q6

1.

i. Inflectional morphology

Def:

The part of grammar that deals with the inflection of words. Inflection produces grammatical variants of the same word. Languages that add inflectional morphemes to words are sometimes called inflectional languages. Inflection can be realized in several ways: affixation, reduplication, alternation, and suprasegmental variation. It is most typically realized by adding an inflectional morpheme (that is, affixation) to the base form (either the root or a stem).

Explanation:

good news(n) -> she was good newsed(adj) [someone who gets good news can be described as “good newsed”]

ii. Derivational morphology

Def:

The part of grammar that deals with the derivation of words. Derivation produces a new word by adding affixes to an existing word or stem. Affixes are bound morphemes that can attach only to a word or stem. The process changes the word both semantically and grammatically.

Explanation:

good news(n) -> she was good newsed(v) [something “good newses” somebody, i.e. “good news” used as a verb]

iii. Constituents

Def:

In syntactic analysis, a constituent is a word or a group of words that function(s) as a single unit within a hierarchical structure. The analysis of constituent structure is associated mainly with phrase structure grammars, although dependency grammars also allow sentence structure to be broken down into constituent parts. The constituent structure of sentences is identified using constituency tests. These tests manipulate some portion of a sentence and based on the result, clues are delivered about the immediate constituent structure of the sentence. Many constituents are phrases. A phrase is a sequence of one or more words (in some theories two or more) built around a head lexical item and working as a unit within a sentence. A word sequence is shown to be a phrase/constituent if it exhibits one or more of the behaviors discussed below.

Explanation:

i talked to my dad’s second cousin [pattern: somebody does something to somebody]

iv. Multi-word units/collocations

Def:

A multi-word unit is a lexical unit formed by two or more words that yields a new concept, different from the composition of the meanings of its elements.

There are about six main types of collocations: adjective+noun, noun+noun (such as collective nouns), verb+noun, adverb+adjective, verb+prepositional phrase (phrasal verbs), and verb+adverb.

Explanation:

i talked to my dad’s second cousin in toronto on (good) friday

v. Coreference

Def:

In linguistics, coreference, sometimes written co-reference, occurs when two or more expressions in a text refer to the same person or thing; they have the same referent.

Explanation:

i talked to my dad’s second cousin in toronto on good friday and he was just good newsed! (he <-> my dad’s second cousin)

vi. Syntactic ambiguity

Def:

Syntactic ambiguity, also called amphiboly or amphibology, is a situation where a sentence may be interpreted in more than one way due to ambiguous sentence structure.

Explanation:

i talked to my dad’s second cousin in toronto on good friday and he was just good newsed! (Who was good newsed: my dad or my dad’s second cousin?)

vii. Number agreement

Def:

Agreement in number between words in the same grammatical construction (e.g., between a subject and its verb, or between adjectives and the nouns they modify).

Explanation:

i talked to my dad’s second cousin in toronto on good friday and he was just good newsed! (“he” is singular, so the singular verb form “was” is used)

viii. “Infinite capacity from finite means”

Def:

Some information can be inferred from a sentence even though it is not stated explicitly in the sentence.

Explanation:

i talked to my dad’s second cousin in toronto on good friday and he was just good newsed! ->

Her dad has more than two cousins.

Toronto had good weather that Friday.

Her dad’s second cousin was in Toronto that Friday.


6.

1. True

Susan and Mary enjoyed the spaghetti they served with her brother.

There are only two forms of ambiguity: lexical ambiguity and syntactic ambiguity.

For this sentence, there are three ambiguities:

  • Who served the spaghetti?
  • Whose brother? Susan’s or Mary’s?
  • Did the brother eat the spaghetti, or did he just accompany the two girls?

2. False

Zipf’s law is a relation between rank order and frequency of occurrence: it states that when observations (e.g., words) are ranked by their frequency, the frequency of a particular observation is inversely proportional to its rank: Frequency ∝ 1/Rank.
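A quick way to check the rank-frequency relation is to build the ranked table and look at rank × frequency, which should stay roughly constant if the law holds. The sketch below is only an illustration: the helper name rank_frequency and the toy token list are made up for this note, and the counts were chosen by hand so that the product comes out exactly constant.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, count) tuples, most frequent word first."""
    counts = Counter(tokens)
    # Sort by descending count; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


# Hypothetical toy corpus: counts 12, 6, 4, 3 follow count = 12 / rank.
tokens = ["the"] * 12 + ["of"] * 6 + ["and"] * 4 + ["to"] * 3
table = rank_frequency(tokens)

for rank, word, count in table:
    # Under Zipf's law, rank * count is approximately constant.
    print(rank, word, count, rank * count)
```

On a real corpus the product only stays approximately constant, and the fit is usually judged on a log-log plot of frequency against rank.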

3. False

Byte pair encoding or digram coding is a simple form of data compression in which the most common pair of consecutive bytes of data is replaced with a byte that does not occur within that data. A table of the replacements is required to rebuild the original data.
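The compression procedure described above can be sketched on a short string. This is a minimal illustration, not a reference implementation: the function names bpe_compress and bpe_decompress are made up for this note, the caller is assumed to supply symbols that do not occur in the data, and pair counting is done naively on overlapping pairs.

```python
from collections import Counter


def bpe_compress(data, unused_symbols):
    """Repeatedly replace the most common adjacent pair with an unused symbol.

    Assumption: none of `unused_symbols` occurs in `data`.
    Returns the compressed string and the replacement table.
    """
    table = {}
    for symbol in unused_symbols:
        # Count every adjacent pair of characters (overlapping, naive).
        pairs = Counter(data[i:i + 2] for i in range(len(data) - 1))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so replacing would not save space
        data = data.replace(pair, symbol)
        table[symbol] = pair
    return data, table


def bpe_decompress(data, table):
    """Undo the replacements in reverse order to rebuild the original."""
    for symbol, pair in reversed(list(table.items())):
        data = data.replace(symbol, pair)
    return data


original = "aaabdaaabac"
compressed, table = bpe_compress(original, ["Z", "Y", "X"])
restored = bpe_decompress(compressed, table)
print(compressed, table, restored)
```

The replacement table is what makes the scheme lossless: without it the substituted symbols cannot be expanded back. The subword tokenizers used in NLP borrow the same merge idea but keep the merged pairs as vocabulary entries instead of using them for compression.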

4. True

Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located close to one another in the space.

Each word is represented not by a discrete, sparse vector but by a d-dimensional continuous vector, and the meaning of each word is captured by its relation to other words.
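The contrast between sparse and dense representations can be seen with cosine similarity. In the sketch below the dense vectors are invented by hand purely for illustration (they are not the output of a trained Word2vec model), and the 2-dimensional size is an assumption made to keep the example readable.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# One-hot (discrete, sparse) vectors: every pair of distinct words is orthogonal.
one_hot = {"cat": [1, 0, 0], "dog": [0, 1, 0], "car": [0, 0, 1]}

# Hypothetical dense vectors, made up by hand for this example.
dense = {"cat": [0.9, 0.8], "dog": [0.8, 0.9], "car": [-0.7, 0.1]}

sparse_sim = cosine(one_hot["cat"], one_hot["dog"])
dense_close = cosine(dense["cat"], dense["dog"])
dense_far = cosine(dense["cat"], dense["car"])
print(sparse_sim, dense_close, dense_far)
```

With one-hot vectors the similarity between any two different words is zero, so no relation between words is encoded; with dense vectors, words that share contexts can end up close together.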

5. True

“frame” + “er” + “s” = “framers”

frame -> framer -> framers

(ex: drive -> driver -> drivers / office -> officer -> officers)
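The derivation-then-inflection chain above can be sketched as repeated suffix attachment. The helper names attach and derive are made up for this note, and the spelling rule (drop a final “e” before a vowel-initial suffix) is a deliberate simplification that happens to cover these three examples, not a general rule of English orthography.

```python
def attach(stem, suffix):
    """Attach a suffix to a stem with one simplified spelling rule.

    Assumption: a final "e" is dropped before a vowel-initial suffix
    (frame + er -> framer). Real English spelling has many more cases.
    """
    if stem.endswith("e") and suffix[0] in "aeiou":
        stem = stem[:-1]
    return stem + suffix


def derive(root, suffixes):
    """Return every intermediate form, starting from the root."""
    forms = [root]
    for suffix in suffixes:
        forms.append(attach(forms[-1], suffix))
    return forms


# "-er" is derivational (verb/noun -> agent noun); "-s" is inflectional (plural).
for root in ["frame", "drive", "office"]:
    print(" -> ".join(derive(root, ["er", "s"])))
```

The order matters: the derivational suffix attaches first and creates the new word, and the inflectional suffix then attaches to that word.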

6.

7. False

Notes, March 7, 2022 19