5 reasons to create your own programming language

  • 1. its easier than you think

mostly its about chopping a file up into lines, then reading the lines and doing stuff based on what words are found. a couple loops and a lot of if-then statements are all you technically need.

  • 2. its fun and rewarding

even if the only language you ever make is a “toy” language, you can always joke that you wrote a toy programming language, which is a real accomplishment for anyone.

  • 3. it will help you understand how programming really works

even though serious languages have a lot more expertise go into their designs, the fundamentals are general enough that writing nearly any language will teach you more than it misleads you (unless youre really fishing for ways to be wrong.) languages differ in incredible ways, but they all have a fair amount in common.

  • 4. it will help you appreciate programming and language developers more

just as nothing will give you more insight into applications than writing them, nothing will give you better insight into what language developers go through than becoming one; particularly if youre still tweaking things half a year later. the first month i worked on my language a lot, and less than a year later i only work on it occasionally. theres no vantage like first person.

  • 5. you can write a useful language if you dont make it too general-purpose

general purpose languages are fantastic, but its incredibly difficult to design one with merits that anyone would want to use. far better to think about one or two things your favorite languages cant do, and focus on those as problems to solve. reinventing the wheel for a red wagon is a lot easier than reinventing the landing gear for an airplane, or even a spare for a minivan. try writing a language that makes just one job easier– or, take an app youve written and bolt on a way to automate it using a text file.

 

note: the difference between a lexer and a parser

a lexer takes your program source and separates it into tokens.

the parser figures out what to do with the tokens.

these are often separate things, and if you can code in c (i dont) there are even projects to generate a custom lexer/parser for you.

its increasingly common (though not necessarily more common than generated ones) to see hand-written parsers, and combined lexer/parsers, so dont think you cant do that.
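
to make the split concrete, here is a minimal hand-written sketch in python. the mini-language (print, followed by numbers joined with plus) and the names lex and parse are made up for this illustration; they are not from fig or any real tool.

```python
# hypothetical mini-language for this sketch: "print 5 plus 2"

def lex(source):
    """lexer: separate the source text into (kind, text) tokens."""
    tokens = []
    for word in source.split():
        if word.isdigit():
            tokens.append(("number", word))
        else:
            tokens.append(("word", word))
    return tokens


def parse(tokens):
    """parser: figure out what the tokens mean, and compute the total."""
    if len(tokens) < 2 or tokens[0] != ("word", "print"):
        raise SyntaxError("expected: print <number> [plus <number> ...]")
    kind, text = tokens[1]
    if kind != "number":
        raise SyntaxError("expected a number after print")
    total = int(text)
    rest = tokens[2:]
    # whatever is left must come in pairs: plus <number>
    if len(rest) % 2 != 0:
        raise SyntaxError("dangling token at end of line")
    for i in range(0, len(rest), 2):
        if rest[i] != ("word", "plus") or rest[i + 1][0] != "number":
            raise SyntaxError("expected: plus <number>")
        total += int(rest[i + 1][1])
    return total
```

the lexer knows nothing about what is allowed; it only labels the pieces. the parser is the part that can say "that line makes no sense."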

and if all that sounds too complicated, just follow these steps to a ridiculously simple toy language:

    1. loop through lines

        2. loop through space-separated words in each line

            3. if you see a word that is contained in an array (or dict) of keywords:

                4. get the next word and do a predefined action in response to that

thats a cheap way to do it, but its enough to get started. the experience you get making the simplest of languages will assist you in any future projects to extend or re/design that language, or develop an improved design.
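
the four steps above can be sketched in python like this. the keywords (say, shout, twice) and the function name run are invented for the example; a real toy language would pick its own.

```python
def run(source):
    """interpret a tiny keyword-then-word language; returns the output lines."""
    output = []

    # dict of keywords, each mapped to a predefined action on the next word
    keywords = {
        "say": lambda word: output.append(word),
        "shout": lambda word: output.append(word.upper()),
        "twice": lambda word: output.append(word + " " + word),
    }

    for line in source.splitlines():        # 1. loop through lines
        words = line.split()
        i = 0
        while i < len(words):               # 2. loop through words in the line
            # 3. is this word a keyword (and is there a word after it)?
            if words[i] in keywords and i + 1 < len(words):
                # 4. get the next word and do the action for that keyword
                keywords[words[i]](words[i + 1])
                i += 2
            else:
                i += 1                      # not a keyword: skip it
    return output


program = "say hello shout there\ntwice bye"
for text in run(program):
    print(text)
```

words that are not keywords are silently ignored here, which is the cheapest possible error handling. raising an error instead would be one of the first things to improve.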

 

 

figtool: arrlen

# arrlen takes stdin and returns the length of each line 

#### license: creative commons cc0 1.0 (public domain) 
#### http://creativecommons.org/publicdomain/zero/1.0/

# $ p=$(echo hello | arrlen | awk '{print $1}') ; echo $p
# 5

# $ for p in $(echo "hello there how are you?") ; do echo $p ; done | arrlen
# 5 hello
# 5 there
# 3 how
# 3 are
# 4 you?
# $

function remfrom z p
    f    z    split f p     join f ""    return f
    fig

forin p stdin
    cr 13    chr
    lf 10    chr
    nextline p    remfrom nextline cr    remfrom nextline lf
    now nextline    len    prints    " "    prints
    now nextline    print
    next
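
for readers who dont have fig installed, here is a rough python equivalent of the same idea. it is a sketch, not the output of the fig translator, and the function name arrlen is only borrowed from the tool above.

```python
def arrlen(lines):
    """for each line, yield its length, a space, and the line itself."""
    for line in lines:
        # strip carriage returns and linefeeds, like remfrom does above
        text = line.replace("\r", "").replace("\n", "")
        yield str(len(text)) + " " + text


# to use it in a pipeline, feed it stdin:
#     import sys
#     for result in arrlen(sys.stdin):
#         print(result)
```
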

are you sure you want to install windows 10? (y/y)

has microsoft created a new breed of malware?

at the height of the era of shareware, nag screens would constantly bother you to “pay for a registered copy” of software you installed the “trial version” of.

trojans on the other hand, are programs that include automatic or malicious features that the user doesnt want, posing as useful software.

if an operating system you paid for and prefer, constantly bothers you to switch to a different version you dont want (to the point where avoiding an upgrade requires not only effort, but vigilance) what kind of malware is that?

welcome to the era of nag trojans. would you like to upgrade now, or wait until you get home?

 

 

for gifguide2code: many folders in 2 more languages

gifguide2code is one of my first subscribers; i enjoy the intros and they encourage me to make more of my own (i would like to start posting my “figtools” although most are written for pipelines in gnu/linux.)

todays intro is “how to make [a lot] of files.” gifguide uses autohotkey; i will use a fig example for windows and gnu/linux, and another example for bash:

#
# public domain

folders = 200

fs = folders    str
fn = "Folders1_"    plus fs

now = "mkdir "    plus fn    shell
now = fn    chdir

for f (1, folders, 1)
    foldern = f    str
    now = "mkdir "    plus foldern    shell
    next

 

bash has more finicky syntax, but lets you do it in one line:

f=200 ; mkdir Folders1_$f ; for p in $(seq 1 $f) ; do mkdir Folders1_$f/$p ; done # public domain
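
since fig runs on python, here is a sketch of the same job in plain python. the function name make_folders and the base parameter are my own additions for the example. one difference from the versions above: exist_ok=True means running it twice will not fail on folders that already exist.

```python
import os


def make_folders(count, base="."):
    """create Folders1_<count> under base, holding folders named 1..count."""
    parent = os.path.join(base, "Folders1_" + str(count))
    for n in range(1, count + 1):
        # makedirs also creates the parent folder the first time through
        os.makedirs(os.path.join(parent, str(n)), exist_ok=True)
    return parent


# make_folders(200) would create Folders1_200/1 through Folders1_200/200
```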

 

what if everyone wrote a programming language?

what would happen? the literal and obvious answer is that we dont know. but if we allow ourselves to speculate, here are some thoughts springboarded by a few facts and metaphors:

 

  • reality: not everyone is going to write a programming language

i know, but this is speculation. a lot of it is really “what if a lot more people wrote programming languages,” but the question is only fair because i encourage everyone to write one.

 

  • natural selection: most of these languages will not be very good

thats ok; in fact its not even important. if people took a few hours to write a programming language, it would be a single lesson that substantially improves the way they understand programming.

 

  • true scotsman: most of the languages wont even be “real languages”

actually, its nearly impossible to create a “programming language” that isnt a programming language. lots of useful software projects can have scripting engines without creating full-fledged languages, and markdown is probably not the very last “standard” of its kind. dont be afraid of being “cute,” because it will be fun.

i made a language that consisted only of the goto statement. it could do 25 different things, and i adapted it from another simple language interpreter someone else wrote. did i mention that i redid their interpreter in my own programming language?

 

  • aiming for the stars: most people cant create a programming language

this is like saying most people cant code. the trick is to make it easier to do. making it easier to create a programming language is something ive been working on for a while. and yes, just like you need things like scratch or basic or logo to get “everyone” programming, you need something simpler if everyone is going to make their own language.

thats not a real problem.

 

  • convergent evolution: too many languages, and no one will know the same one

this isnt true at all. if everyone made a programming language, the features would coalesce into more serious languages. even if only a few inspired a new feature (or approach) in serious or educational languages, it would be worthwhile. but although this would likely happen, even if it didnt:

 

  • optimal learning: we should focus on serious languages

i dont agree with this either. first of all, educational languages have made it easier for younger programmers to start earlier and get comfortable with computing. the idea that we can skip this step is like saying we could skip educational programs to introduce young children to reading, because the best time to start learning to read is grade 4.

no one learns how to read “the wrong way,” or how to code the wrong way: thats paranoid nonsense. some people are not good at coding, and other people have not yet learned best practices, but dabbling on a toy piano is not going to prevent you from becoming a skilled pianist or composer.

only in programming is there a fear that early education and practice could somehow “taint” the future student. when there is more (or any) reasonable scientific evidence that this is the case, thats another matter. (even then, its still more paranoid to think it is irreversible.)

 

  • but why? you havent said why…

the reasons vary, so i will give mine.

  • its fun
  • it will make you a better programmer
  • its not nearly as difficult as its made out to be
  • you will better appreciate and understand programming
  • its just one more “programming exercise” worth trying
  • it will inspire you to write lots of other programs
  • i believe it will ultimately push language development forward
  • literacy is not just about reading, but authoring

you can argue against all of those– and i can argue for them. really, my job isnt to convince you to do something you dont want to do; only to make a case for trying it, and to help people that are interested by making it easy enough to actually achieve something.

thats pretty much all i have to say about it for now. i would love to revisit the subject or have other people share their thoughts on the matter.

note that when i talk about “writing a programming language” i am talking about a simple starter language: something that could be done in hours, days, or a week (depending on the level of sophistication.)

i do include writing your own program to compile (probably to another high-level language) or interpret the language; you could also adapt an existing compiler or interpreter.

whether it became a bigger project than that would depend on how much the author got out of the effort. if it was boring and they didnt learn much, i wouldnt necessarily push going forward from there. but if they had fun at all, they could continue with it or write a slightly more enhanced language after that.

if there were a push towards trying, i think a lot of people would realize it isnt that difficult; even if they thought it was something they would never do.

 

 

a review of the 7 basic concepts

theres nothing better for a difficult spot in the learning process (for those wondering: this is around the time functions are being learned) than a little review of whats been covered so far. so heres a quick one; note the substantial range of concepts:

(note: although fig examples are used, most of the concepts illustrated here are common to many languages. the ideas of a “main variable” and “shared lines” are less common, but similarities exist to features in other languages.)

 

variables: as you now know, each one of these holds a piece of information. the piece can be large, but the piece count is singular. each variable has a unique name (you supply the name.)

p = 5 # set p to 5
p = "hello there" # set p to "hello there"
# p is the main variable in these lines
x = p # set x to the value of p (copy p to x)
# x is the main variable in that line, holding "hello there"

 

an array is like a variable that can hold several pieces, and has a name and a numeric index.

p = "hi" ; arr # p is "hi" ; create array named p holding "hi"

 

most lines of code in fig begin with a “main variable” that you supply the name for; such lines are called “shared lines” because several commands can share the line (with the main variable.) a few commands in fig use unshared lines instead: the “block commands” (you will find out about those in a moment) and a small number of others get the line theyre on all to themselves.

 

input: information goes from the outside world to the computer and/or program, through a physical device such as a keyboard or mouse or camera.

by extension, this includes files which are stored on a physical device. by further extension, this includes anything that simulates a file system. if you like, you could say a program or function has “input” in the form of parameters.

an input command gets information from the world and brings it into a program, usually by putting it in a variable.

x = lineinput # get keyboard input, storing whats typed as x

 

output: information goes from the computer to the outside world, to a physical device such as speakers, a screen or a printer. some devices are in fact input and output devices, such as a modem or other network device (and really, a printer.)

output includes files. through “pipes,” the text output of one program can become the input of another program.

an output command sends information from the program to the outside world.

x = "hello there" ; print # x is "hello there" ; put on screen

 

basic math: as everything is a number to the computer, a great deal can be accomplished with numeric manipulation, or simple math. the good news is that a lot of this math can be abstracted into word-based (rather than numeric) commands, and also the computer will do the math for you.

computing does not require fantastic math skills, but it will help to build math skills anyway. this is simply one more advantage of learning the basics of code and 100, 105, 103, 105, 116, 97, 108, 32, 108, 105, 116, 101, 114, 97, 99, 121.

x = 5 ; plus 2 ; print # x is 5 ; add 2 to x ; put x on screen
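
to see “everything is a number” in action, here is a small python sketch that turns the numbers in the paragraph above back into text. chr and ord are standard python functions; the names decode and encode are just for this example.

```python
codes = [100, 105, 103, 105, 116, 97, 108, 32,
         108, 105, 116, 101, 114, 97, 99, 121]


def decode(numbers):
    """turn a list of character codes into a string."""
    return "".join(chr(n) for n in numbers)


def encode(text):
    """turn a string into a list of character codes."""
    return [ord(ch) for ch in text]


print(decode(codes))          # run it to read the hidden words
print(chr(ord("a") + 1))      # simple math on a letter: prints b
```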

 

loops: a loop begins on a single line of code, then there are one or more lines of code “inside” the loop; the end of the loop is marked with another line (at least in fig, and some other languages.)

code inside the loop repeats until the loop is “broken” or “finishes.” a for loop finishes when there are no more numbers in the range specified. a forin loop finishes when there are no more items to “loop (or ‘iterate’) through.” a while loop (in fig) just keeps going until you use the break command, which can end the other types of loop early.

the way break is normally used is with a conditional; that way, the loop continues until it finishes, or until the conditional is true.

while
x "hello again" print # set x, then put x on the screen
wend

 

you can “break” while running the program, using ctrl-c on the keyboard.
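
for comparison, here are the three kinds of loop in python, which is what fig runs on. this is a sketch with made-up function names; it assumes a for loop range that includes its last number and counts upward.

```python
def count_range(start, stop, step):
    """like a for loop: finishes when the range runs out."""
    seen = []
    # assumption: stop is included, so add 1 to pythons exclusive end
    for n in range(start, stop + 1, step):
        seen.append(n)
    return seen


def each_item(items):
    """like a forin loop: finishes when there are no more items."""
    seen = []
    for item in items:
        seen.append(item)
    return seen


def until_limit(limit):
    """like a while loop: keeps going until a conditional breaks it."""
    count = 0
    while True:
        count += 1
        if count >= limit:    # the conditional that decides when to stop
            break
    return count


print(count_range(1, 5, 2))   # [1, 3, 5]
print(each_item("abc"))       # ['a', 'b', 'c']
print(until_limit(3))         # 3
```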

the line that starts a loop, the lines “inside” it, and the line that ends it are collectively referred to as a “command block.” all command blocks in fig can be ended with the fig command. for the for loop, you may use next instead. to end a forin loop, you could use nextin (or next if you like.) while loops may end with wend. use fig except where you prefer one of these.

 

conditionals: a conditional is a command block which starts with commands such as iftrue, ifmore, ifless, or ifequal. iftrue runs the block it begins if the value (or variable) paired with it is non-zero. this means that if you run iftrue on the string value “hello world”, it will run the code inside the block because “hello world” has more than zero characters. -.001 is also “true” despite being negative, because anything other than exactly zero is non-zero.

ifmore compares two values or variables, (or one value with one variable, etc.) and is “true” if the first value is greater. this includes the alphabetical order of strings, by the way. ifless works like ifmore, except it runs if the first value is less than the other. ifequal runs if both values/variables are equal.

try/except/resume is a three-line block command (with at least one line “inside” each of its two sections) that runs the code inside the second section on the condition that the first section creates a program error. this kind of conditional is used to “trap” and then respond to errors– perhaps to ask for different input, or display a custom error message of your own wording.

ifmore 5, 7
x "this line will not run, because 5 is not more than 7" print
fig

try
x = 5 ; divby 0
except
p "you try, but find you still cant divide things by zero" print
resume
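
the same ideas look like this in python. this is only a sketch using pythons own rules for what counts as “true,” which may not match fig in every edge case; the function names are made up for the example.

```python
def is_true(value):
    """true for any non-zero number or non-empty string."""
    return bool(value)


def safe_divide(a, b):
    """try the division; if it causes an error, respond with a message."""
    try:
        return a / b
    except ZeroDivisionError:
        return "you try, but find you still cant divide things by zero"


print(is_true("hello world"))   # True: more than zero characters
print(is_true(-0.001))          # True: negative, but not exactly zero
print(is_true(0))               # False
print("apple" < "banana")       # True: strings compare alphabetically
print(safe_divide(5, 0))
```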

 

functions: truth be told, every command in fig is much like (or actually) a function. in fig, the function command lets you create your own custom commands! a function is a block of code with (optional) parameters defined by you in the top line, and an (optional) value returned by the return command when the function is “called” (that is, when the function is used like any other command.)

in a program, a function has two forms: a definition, and a call. the definition (using the function command) outlines ordinary fig code in a block. the “call” is where you use the name of your defined function anywhere in a program: a function can even call itself from inside its definition!

function onceagain
z "this line is printing again" print
x onceagain
fig

 

a function calling itself is an example of “recursion.” there are situations where recursion is an extremely useful design; in this example, the recursion simulates a loop. fig runs on python, and python has an (adjustable) recursion limit: after it gets about 1000 calls “deep,” a function will not be able to continue calling itself from the same point.
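
the recursion limit is easy to see from python directly. in this sketch (the function name countdown is made up) the function calls itself until it reaches zero; unlike the fig example above, it has a stopping point, so small counts finish normally and only very deep ones hit the limit.

```python
import sys


def countdown(n):
    """call itself n times, then stop."""
    if n == 0:
        return "done"
    return countdown(n - 1)


print(sys.getrecursionlimit())   # usually 1000 unless it has been changed
print(countdown(50))             # done

try:
    countdown(sys.getrecursionlimit() + 10)
except RecursionError:
    print("too deep: python refused to go further")
```

sys.setrecursionlimit can raise the limit, which is the “adjustable” part.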

most programs will call their functions from somewhere other than inside the function; in other words, most function calls are not recursive.

fun note: the gnu operating system has a recursive acronym: gnu stands for: gnus not unix. the g in “gnu” stands for: gnu. (the people that put the operating system together named it this way because they thought it was funny. the fact that this is what is considered “funny” to programmers is itself kind of funny…)

most programming consists of these (or variations on these) 7 concepts!

  • variables
  • input
  • output
  • basic math
  • loops
  • conditionals
  • functions