
Toward Programming Linguistics

Kanada, Y., Master's Thesis, University of Tokyo Graduate School, 1981 (in Japanese).

[ Japanese page ]
[ Kindle edition ]
[ Paper PDF file (6 MB !) (in Japanese) ]

Introduction to this research theme: Programming Linguistics


If we recognize that programming languages are not only languages for machines but also languages that human beings write and read [and use for human-to-human communication], we can see that the study of them belongs rather to the humanities. This recognition makes it possible to study programming languages using the methods of linguistics. It also reveals the significance of comparative studies of programming languages and natural languages.

Until now, some software scientists have been interested in linguistic research on programming languages, but they seem not to have carried it out. I therefore believe that this paper should establish the basis of such research. To that end, this paper presents linguistic viewpoints on programming languages, shows which types of analysis methods can and should be applied to which parts of programming languages, and tries to indicate the direction of linguistic research.

The most important of the linguistic viewpoints presented in this paper are to regard programming languages as systems that include customs (i.e., rules that are not codified, that is, not specified in language specifications), and to regard program unit names as relationships to abstractions. In addition, as a study in applying linguistic methodology to programming languages, this paper lists and explains several structural similarities to natural languages. It also tries to show the research direction of the fields of programming linguistics, from morphology to semantics. Among them, one of the most important branches (of semantics) is research on ambiguity in programming languages.

Linguistic research mainly targets existing programs. Although this paper does not go deeply into the analysis of existing programs, I believe future research projects will find many clues in it.


  1. Introduction
    1. Research fields that relate to programming languages
    2. Significance of contrastive research on programming and natural languages
  2. What are programming languages?
    1. Three expressions of programming languages
    2. Rules of programming languages
      1. Codified rules and non-codified rules 1
      2. Are there non-codified rules?
      3. Codified rules and non-codified rules 2
      4. Classification of codified rules
      5. Classification of rules by communicative function
      6. Rules of programming languages and rules of natural languages
      7. Flexibility of rules
      8. Complexity of rules
      9. Problems of linguistic research on rules
    3. Independence of programming languages
      1. Reasons for examining the independence
      2. Reasons that support dependence
      3. Reasons that support independence
      4. Examination from engineering viewpoint
    4. Understanding meanings
      1. Basic triangle
      2. Definition of "meaning"
    5. Characteristics of languages
      1. "Characteristics of natural languages": A. Communicative function, B. Arbitrariness of symbols, C. Systematic nature, D. Linearity of symbols and messages, E. Discreteness of units, F. Double articulation
      2. Do programming languages have characteristics of languages?: A. Communicative function, B. Arbitrariness of symbols, C. Systematic nature, D. Linearity of symbols and messages, E. Discreteness of units, F. Double articulation
    6. Other characteristics of programming languages
      1. Ambiguity
      2. Closedness and evolutionary nature
      3. Non-speakability of programming languages
      4. Frequency of naming and scope of words
      5. Programs are "used" and rewritten
    7. Non-linguistic part of programming languages
      1. Nature of expressions: A. Non double articulation, B. Non-linearity
      2. Meaning of existence of non-linguistic part
    8. Variety and changes of programming languages
  3. Programming linguistics and its research fields
    1. Individual research
      1. Fields of individual research
      2. Syntax
      3. Semantics
    2. Comparative and contrastive research
  4. Morphology -- an initial step
    1. Classification of symbols and their structures
    2. Articulation of identifiers
    3. Structure of variable names
    4. Structure of calls
      1. Structure in language specifications
      2. Structure of function calls
      3. Structure of procedure calls
      4. Consideration on structure of procedure calls from engineering viewpoint
    5. Contraction of identifiers
      1. Restrictions of identifier length and solutions
      2. Contraction rules
  5. Research on ambiguities
    1. General meaning, polysemy, and homonymy
    2. Causes of polysemy and homonymy
    3. Need for ambiguity resolution
    4. Resolution of polysemy and homonymy
    5. Meaning of ambiguities
  6. Conclusion
  7. References
  8. Acknowledgments

P.S. (2007)

A paper that analyzes programming languages by the methods of (human) linguistics, i.e., humanistically. (Linguistically, there are not many differences in programming languages between 1981 and now. However, data description languages, especially XML, which were not widely used then, are now widely used. I think we will gain important knowledge by analyzing XML linguistically.)

Keywords: Programming linguistics, Linguistics, Software linguistics, Non-codified rule, Arbitrariness, Non-linearity, Discreteness, Non double articulation, Ambiguity, Closedness, Evolutionary nature, Scope, Morphology, Polysemy, Homonymy, Semiotics, Semiology



This page contains a single entry from the blog posted on March 31, 1981 12:00 AM.


(C) Copyright 2007 by Yasusi Kanada