Methods, systems and software for generating sentences,
and visual and audio compositions representing said sentences
The present invention provides methods, systems and software for: rule-based transforming of an initial thesaurus-like organized input lexicon into a rule-based lexical ontology containing inflectional forms of words represented by means of syntactical-complements-joining sets, the elements of which are words that act as hidden semantic intermediaries mediating the syntactic pairing of words in a sentence; rule-based generating of sentences from sequences of interdependent instructions for syntactic pairing of grammatically correct inflectional forms of words, wherein the pairs of syntactical-complements-joining sets specified by said instructions, which represent said pairs of grammatically correct inflectional forms in said rule-based ontology, have at least one semantic intermediary in common; and generating of visual and audio compositions representing said generated sentences.
This invention relates to the field of computational creativity in the areas of natural-language generation (NLG) and the arts.
The goal of computational creativity is to design computer programs that model, simulate or replicate human creativity, understood as an ability to create novel representations of pre-existing ideas or objects, said novel representations arising through transformations of initial and intermediate representations that are rule-based, statistically based, or inspired by the functioning of biological systems.
Examples of systems developed to date in the area of the arts include Harold Cohen's AARON and Penousal Machado's NEvAr system (for "Neuro-Evolutionary Art"), the latter belonging to the group of systems that use methods inspired by the functioning of biological systems.
Examples of general-purpose NLG systems developed to date include FUF/Surge, RealPro, Penman/KPML, Nitrogen, Amalgam, and Fergus. Considered broadly, presently known or used NLG systems differ in the way they represent linguistic knowledge. With respect to linguistic knowledge representation, a lexical ontology has become the defining term for the part of language modeling that excludes the instances, yet describes what they can be. In general terms, a lexical ontology defines the set of representational primitives with which to model the domain of linguistic knowledge. Currently popular classes of models for linguistic knowledge representation include semantic network models (Quillian), the KL-ONE class of knowledge representations (used by Penman/KPML) and neural network models. Overall, current NLG systems are split between two traditions: stochastic or Markov language models (LM), which are often criticized because they have difficulty explaining future linguistic behavior and creativity; and formal symbolic (rule-based) LM based on the Chomsky tradition, which are often criticized because they trivialize the problem of lexical choice.
The present invention has two components: a natural language generating (NLG) component dealing with the generation of an English ontology and the generation of English sentences in an improved natural language generation system of the type commonly called formal symbolic (rule-based) systems; and a composition generating (CGS) component dealing with the generation of visual and audio compositions in a system belonging to the general type of abstract art generating systems inspired by language and the functioning of biological systems.
The NLG component of the invention includes methods, systems and software for: transforming an initial input set of ‘thesaurus-like organized’ words (L), wherein said words are represented by means of subsets of synonyms and antonyms to their part-of-speech-specific senses, into a rule-based ontology set (Lo), wherein inflection-complement-joining tagged (Jkij) forms of said words are represented by means of syntactical-complement-joining tagged (PJk) sets comprising semantic intermediaries that mediate the syntactic pairing of said words; and rule-based transforming of a sequence (Sn) of interdependent pairing instructions (kgGig) into an output sentence (e), each of said instructions encoding a specific syntactic rule for PJk-based pairing of grammatically correct inflectional forms (Jkij) of words belonging to said rule-based ontology set. The methods provided by the present invention are the result of an improved rule-based model of language (LM), which is based on analyzing the connectivity between syntactically compatible words belonging to the initial input set of words L and is inspired by patterns of cortical organization. Unlike other rule-based LM, wherein the choice of the words filling in the abstract framework of a sentence is reduced to a simple look-up method secondary to grammar, according to the LM provided herein said abstract framework is: (i) represented as a sequence of interdependent pairing instructions for joining together grammatically correct inflectional forms of words represented by means of syntactical-complement-joining sets of identified semantic intermediaries (of the type commonly called ‘semantic universals’); and (ii) sequentially filled in with grammatically correct inflectional forms of pairs of words that have at least one semantic intermediary in common.
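By way of a non-limiting illustration, the pairing condition described above (two inflectional forms may be joined when their sets of semantic intermediaries have at least one element in common) can be sketched in Java as a set intersection. The class name, the method names and the toy ontology entries below are hypothetical and were invented for this illustration only; they are not taken from the ontology set Lo or from the software of the invention, and the tagging by Jkij and PJk is omitted.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: pairing word forms through shared semantic intermediaries. */
class SharedIntermediaryPairing {

    // Toy stand-in for a rule-based ontology: each inflectional form maps to
    // its set of semantic intermediaries. All entries are invented examples.
    static final Map<String, Set<String>> ONTOLOGY = Map.of(
            "bright", Set.of("light", "clear", "vivid"),
            "stars", Set.of("light", "sky", "night"),
            "stones", Set.of("hard", "earth"));

    /** Returns the intermediaries common to both forms (empty if none or unknown). */
    static Set<String> sharedIntermediaries(String first, String second) {
        Set<String> common = new TreeSet<>(ONTOLOGY.getOrDefault(first, Set.of()));
        common.retainAll(ONTOLOGY.getOrDefault(second, Set.of()));
        return common;
    }

    /** Two forms may be paired only if they share at least one intermediary. */
    static boolean canPair(String first, String second) {
        return !sharedIntermediaries(first, second).isEmpty();
    }

    /** Picks the first candidate that can be paired with the given form, if any. */
    static Optional<String> firstPairable(String form, List<String> candidates) {
        return candidates.stream().filter(c -> canPair(form, c)).findFirst();
    }

    public static void main(String[] args) {
        System.out.println(sharedIntermediaries("bright", "stars"));
        System.out.println(canPair("bright", "stones"));
        System.out.println(firstPairable("bright", List.of("stones", "stars")));
    }
}
```

In this sketch a sequence of pairing instructions would correspond to repeated calls of firstPairable, each call constrained by the form selected in the preceding step; the syntactic rule encoded by each instruction is not modeled.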
With respect to its composition generating component (CGS), the present invention provides novel means for producing a potentially infinite number of different visual and audio compositions representing sentences produced from the finite rule-based ontology set provided by the present invention. More particularly, the CGS component of the invention includes software for: converting an input sentence or a sequence of sentences into an intermediate output sequence of fragments; developing and operating visual and audio palettes (CoP, ShP, SiP, MP, SP); converting an input sequence of fragments (Sf) into an intermediate output visual (Vm) or audio (Am) motif; and converting an input motif into an output visual (Vc) or audio (Ac) composition representing a sentence or a sequence of sentences generated by NLG.
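By way of a non-limiting illustration, the first two conversions named above (sentence to sequence of fragments, and sequence of fragments to motif) can be sketched in Java as follows. This summary does not specify how fragments are formed or how palette entries are selected, so the sketch makes two loudly stated assumptions: a fragment is taken to be a whitespace-delimited word, and a palette entry is selected by a hash of the fragment. The class name, the method names and the four-entry palette are hypothetical and do not describe CoP, ShP, SiP, MP or SP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a sentence -> fragments -> motif conversion. */
class CompositionPipelineSketch {

    // Invented four-entry palette used only for this illustration.
    static final List<String> PALETTE = List.of("red", "blue", "green", "yellow");

    /** Assumption: a fragment is a whitespace-delimited word of the sentence. */
    static List<String> toFragments(String sentence) {
        String trimmed = sentence.trim();
        if (trimmed.isEmpty()) {
            return List.of();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    /** Assumption: each fragment selects one palette entry via its hash code. */
    static List<String> toMotif(List<String> fragments, List<String> palette) {
        List<String> motif = new ArrayList<>();
        for (String fragment : fragments) {
            // floorMod keeps the index non-negative even for negative hash codes.
            int index = Math.floorMod(fragment.toLowerCase().hashCode(), palette.size());
            motif.add(palette.get(index));
        }
        return motif;
    }

    public static void main(String[] args) {
        List<String> fragments = toFragments("the bright stars");
        System.out.println(fragments);
        System.out.println(toMotif(fragments, PALETTE));
    }
}
```

The mapping in this sketch is deterministic, so the same sentence always yields the same motif; a system intended to produce many different compositions from one sentence would need an additional source of variation, which is not modeled here.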
The methods, systems and software provided by the present invention can be applied in the areas of natural language generation and the arts for the purpose of artistic activities such as the creation of audio-visual installations comprising a potentially infinite number of different audio-visual compositions generated and displayed in real time, and/or the creation of a potentially infinite number of different visual patterns or original prints.
The invention has been tested, and it can be assumed that the software component of the invention (which has been written in the Java programming language) is capable of transforming: the initial thesaurus-like organized input into a finite rule-based ontology set; and inputs from the finite rule-based ontology set provided herein into sentences, audio compositions, and a potentially infinite number of different visual compositions.
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of examples and in the figures and will herein be described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present invention as defined by the appended claims. It should also be noted that, in most cases, arbitrary symbols and uncommon terms and abbreviations have been used for the purpose of describing the present invention.