starting point of this blogsite, and the story behind stoa
july 27, sun
stoanim


the necessity to start

5 years ago, i was obsessed with a service where 2 matched users would listen to music & chat together. there was a lady whose music taste was similar to mine. we spent hours talking about music, life, the world... in the end, i asked her "so what's the meaning of life?". her answer was: to create.

back then, my interpretation was: to brood. i was a metaphysical teenager.

the fact is, i've been indulging in depression & corruption for these 5 years, absolutely wasting my time & doing nothing.
i have serious bipolar disorder and i've been medicated for 3 years. i try to take the minimum dose of quetiapine, but still, my thinking ability and intelligence have been hugely affected.
and three years ago, i started to abuse llms. llms make me think less, write less, lazier and lazier, dumber and dumber. i've lost my creativity.
well. i must make a change. i must start to work more, write more, create more. i must abandon brooding and llms. the problem is, i don't know what to do. i know a little about computer science & data science, but that's all.
this will be a long journey. but at least, i need to record it from the very beginning.

what do i need to start

so, let's clear our mind one more time.

i want to create something
i don't know what to create specifically for now
i'd like to record all i learn & do through the journey

my plan is to explore one little subject/project every 2~3 weeks. i'd like to make a video & a blogpost for each of these little subjects.

(un)fortunately, i've found myself terrible at videos, so i've delayed the plan of making them.

first of all, i need something that helps with video presentation & blog creation. or, more generally, something that helps with writing. the idea is simple: i just write presentations & blogposts in a text format and write a compiler to transform it into an html webpage or other formats.
and the final solution is intuitive: crafting my custom markup language, stoa.

before crafting my own markup language, i wrote a compiler that transforms a neorg[1] workspace into an html blogsite.

starting from the philosophy implantation

the philosophy is based on this fundamental question: what do i want to reside in?
i care about my sanity most. so i hope this language can evoke a feeling of calm, immersion, purity, serenity, a feeling of zen.

i have many metaphors & descriptors for the implantation: holy-minimalism, metaphysical-minimalism, ... anyway, you can sense that minimalism is the central doctrine.

essentially, stoa is inspired by markdown and neorg. and the philosophy is basically derived from what i dislike about neorg and markdown.

minimalism above all

neorg is beautiful & powerful, but it's by no means minimal. you may need half a day to understand how to use it and a month to use it effectively. even now, i still need to recheck the spec to remember some parts of the language.
i want a language you can fully understand and use in like, 10 mins.
to achieve this, i'll only add a feature when it's absolutely necessary.
the kinds of blocks i need are easily listed: paragraph, heading, (unordered) list, codeblock, mathblock, footnote and sidenote. i also need some inline elements: links and markups, where the only markups i need are inlinecode and highlight. plus metadata, these are all the building blocks of stoa.
this makes up a 47-line grammar for stoa.

independent of text editors

since june, i've migrated to helix and quite likely won't return to neovim. i'll certainly move to different editors in the future, but i shouldn't need to transform my existing files.
to achieve this, stoa must work identically across all text editors. this means zero editor-specific features like syntax highlighting, real-time rendering, extensions, over-engineered building blocks and more.

strictness

stoa is intentionally strict. there's exactly one way to do each thing.
compare markdown's multiple unordered list syntaxes

* a list
+ a list  
- a list
markdown


with stoa's single unordered list syntax.

= in stoa, there's always only one way.
stoa


lexical structure: columns, inlines and continuation rules

stoa is built on two fundamental concepts: columns and inlines.
a column is the abstract building block and inlines are semantic decorations within columns. we use minimal delimiters to identify these columns and markups strictly.
here's the full grammar of stoa:

& inlines

= markup: `inlinecode`, |highlight|, ...
= link/ref: [url], [text | url], [#ref]

& columns

&& (nested) heading and list

= depth 1
== depth 2
=== depth 3

&& sidenotes

-- comment
|| annotation
!! warning

&& fenced blocks

|> scheme
(display "hello stoa")
|>

>>= math
∑ i = n(n+1)/2
>>=

& metadata
:: key  & value

& footnotes

invoke: [#1]
define:
#1 text

& continuation rule

= base item
  continued on same line
     any indentation preserves flow

= new paragraph with empty line(s)

  starts fresh line within block
stoa


a stray section about programming languages

i've tried some languages through the project: python, rust, haskell, janet, zig, perl, ruby...

haskell has always been my favorite programming language. it's also the most annoying, frustrating one, making me suicidal.
odin is my most comfortable language for low level programming.
python (pypy) comes first when writing scripts.
rust would be the first choice if i need to write a serious program.
racket is the most hygienic one for my mental health.
i won't use zig until it has better documentation.

and stoac, the stoa compiler, is written in nim.

i can't quite figure out why i chose nim. i suppose it was just intuition.

stoac implementation walkthrough

stoac processes each stoa file in a single linear pass, line by line, without building token trees or asts. the core abstraction is the lexer, which is a stateful string builder in action.
the single-file processing functionality of stoac is defined as:

proc processStoaFile(inputPath: string, outputDir: string,
      inputDir: string): BlogCard =
   let content = readFile(inputPath).splitLines()
   var
      lexer = Lexer()
      res = Res()

   parseStoaFile(lexer, res, content)

   let relativePath = inputPath.relativePath(inputDir)
   let outputPath = outputDir / relativePath.replace(".stoa", ".html")
   let outputDirPath = parentDir(outputPath)
   if not dirExists(outputDirPath): createDir(outputDirPath)

   injectHtml(res, outputPath)

   return extractBlogCard(res, inputPath, inputDir)
nim


note that parts of the code are omitted for illustration purposes.

the first procedure we need to check here is parseStoaFile:

proc parseStoaFile(
   lexer: var Lexer, res: var Res, lines: seq[string]
) =
   # when initialized, lexer.lineno is set to -1
   # this would ensure the first line of the file being parsed correctly
   discard lexer.forwardLine
   while lexer.lineno < lines.len:
      lexer.parseLine(res)
nim


lexer.forwardLine instructs the lexer to advance to the start of the next line.
lexer.forwardLine is called inside lexer.parseLine, so we don't call it inside this while-loop.

we forward to parseLine procedure:

proc parseLine(lexer: var Lexer, res: var Res) =
   var ckind = lexer.line.getColumnKind
   lexer.parseColumn(res, ckind)
nim


note that at this point, lexer is always positioned at the very start of a column. the first thing it does is recognize the kind of that column.

type
   ColumnKind {.pure.} = enum
      paragraph, heading, list, comment, ...

proc getColumnKind(line: string): ColumnKind =
   var state = "START"
   for c in line:
      let key = (state, c)
      if key in columnTransitions: state = columnTransitions[key]
      elif state == "START": return paragraph
      elif state in columnFinalStates and c != ' ': return columnFinalStates[state]
      else: return paragraph
nim


stoac analyzes the current column kind using a finite state machine with a transition table. for example, the transitions for heading and comment are:

const
   columnTransitions = {
      ("START", '&'): "HEADING_0", ("HEADING_0", '&'): "HEADING_0",
      ("HEADING_0", ' '): "HEADING_F", ("HEADING_F", ' '): "HEADING_F",

      ("START", '-'): "COMMENT_0", ("COMMENT_0", '-'): "COMMENT_1",
      ("COMMENT_1", ' '): "COMMENT_F", ("COMMENT_F", ' '): "COMMENT_F",
   }.toTable
nim


the initial state is START.
for each character in the line, we compute a transition key: (current state, current character).
if the key exists in the transition table, we update the state.
if the key is missing and we’re still in START, the line is a paragraph.
if the current state is a final state, we map it to a specific column kind.

let's feed && this is a heading into the fsm:

the first transition key is ("START", '&'), which is in the table and we get the updated state HEADING_0.
the next transition key is ("HEADING_0", '&'), which is in the table and we get the updated state HEADING_0.
the next transition key is ("HEADING_0", ' '), which is in the table and we get the updated state HEADING_F.
the next transition key is ("HEADING_F", 't'), which is not in the table but the state is a final state (marked by _F).
we return the column kind heading.
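the classifier described above can be sketched outside of nim, too. here's an illustrative python version, where the state names and transitions mirror the nim snippets but the function names are my own:

```python
# a python sketch of stoac's column-kind fsm; names mirror the nim code,
# but this is an illustration, not the actual implementation.
COLUMN_TRANSITIONS = {
    ("START", "&"): "HEADING_0", ("HEADING_0", "&"): "HEADING_0",
    ("HEADING_0", " "): "HEADING_F", ("HEADING_F", " "): "HEADING_F",
    ("START", "-"): "COMMENT_0", ("COMMENT_0", "-"): "COMMENT_1",
    ("COMMENT_1", " "): "COMMENT_F", ("COMMENT_F", " "): "COMMENT_F",
}
COLUMN_FINAL_STATES = {"HEADING_F": "heading", "COMMENT_F": "comment"}

def get_column_kind(line: str) -> str:
    state = "START"
    for c in line:
        key = (state, c)
        if key in COLUMN_TRANSITIONS:
            state = COLUMN_TRANSITIONS[key]
        elif state == "START":
            return "paragraph"  # first char already breaks every delimiter
        elif state in COLUMN_FINAL_STATES:
            return COLUMN_FINAL_STATES[state]  # delimiter fully matched
        else:
            return "paragraph"  # partial delimiter, e.g. "&x" or "-x"
    return "paragraph"  # line consumed without leaving a final state
```

running the walkthrough example through it, get_column_kind("&& this is a heading") traverses START → HEADING_0 → HEADING_0 → HEADING_F and returns "heading".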

we forward to parseColumn, which is essentially a router to the different specific parser combinators, based on the column kind.

proc parseLine(lexer: var Lexer, res: var Res) =
   var ckind = lexer.line.getColumnKind
   lexer.parseColumn(res, ckind)

proc parseColumn(lexer: var Lexer, res: var Res, ckind: ColumnKind) =
   case ckind
   of heading: lexer.parseHeading
   of list: lexer.parseNested(ckind)
   of comment: lexer.parseSidenote(ckind)
   else: lexer.parseParagraph
   lexer.appendColumn(res, ckind)
nim


let's take the combinator for sidenotes as an example:

proc parseSidenote(lexer: var Lexer, ckind: ColumnKind) =
   let classStr = case ckind
      of annotation: "annotation"
      of warning: "warning"
      else: "comment"
   lexer.cur &= fmt"<div class='sidenote {classStr}'>"
   let parts = lexer.line.splitWhitespace(1)
   let content = if parts.len > 1: parts[1] else: ""
   lexer.parseContentWithContinuation(content)
   lexer.cur &= "</div>"

proc parseContentWithContinuation(lexer: var Lexer, content: string) =
   lexer.line = content.strip
   lexer.char = lexer.line[lexer.offset]
   lexer.parseInlineElements
   while true:
      if not lexer.forwardLine: break
      if lexer.line.isEmptyOrWhitespace: lexer.cur &= "<br/>"
      elif lexer.char == ' ':
         lexer.line = lexer.line.strip
         lexer.char = lexer.line[lexer.offset]
         lexer.parseInlineElements
      else: break
nim


essentially, what it does is:

match the sidenote type (annotation, warning, comment) and open the corresponding html tag.
strip the delimiter and parse inline elements in the current line.
advance through lines, parsing inline elements as long as the continuation condition holds.
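the continuation part of those steps can be sketched in python. the function name and return shape below are hypothetical, not stoac's actual api; it only illustrates the rule that blank lines become breaks, indented lines continue the column, and anything else starts a new column:

```python
# illustrative sketch of the continuation rule; not stoac's actual code.
def gather_continuation(lines, start):
    """collect content for the column starting at lines[start];
    returns (content_parts, index_of_next_column)."""
    first = lines[start].split(None, 1)      # strip the "--"/"||"/"!!" delimiter
    parts = [first[1] if len(first) > 1 else ""]
    i = start + 1
    while i < len(lines):
        line = lines[i]
        if line.strip() == "":
            parts.append("<br/>")            # blank line: fresh line within block
        elif line.startswith(" "):
            parts.append(line.strip())       # indentation preserves flow
        else:
            break                            # a new column begins here
        i += 1
    return parts, i
```

for example, on ["-- a comment", "  continued", "next paragraph"] it gathers ["a comment", "continued"] and reports that the next column starts at index 2.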

the next procedure is parseInlineElements:

proc parseInlineElements(lexer: var Lexer) =
   while true:
      if lexer.char in markupSymbols: lexer.parseMarkup
      elif lexer.char == '[': lexer.parseLink
      else: lexer.cur &= escapeHtmlChar(lexer.char)
      if not lexer.forwardChar: break
nim


stoac parses inline elements character by character:

if a markup delimiter is found, a markup parser (also a finite state machine) is used.
if a link delimiter is encountered, a link parser is invoked.
regular characters are escaped and appended to the buffer.
note that links and markups are strictly single-line; they cannot span lines. the implementations of parseMarkup and parseLink are trivial, so we skip them here.
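for illustration, here's a python sketch of the character-by-character inline scan, handling only inlinecode and highlight. the html tags chosen (code, mark) are my assumptions, and unlike the real stoac this sketch doesn't escape html or parse links:

```python
# illustrative character-by-character inline scan; tags are assumed,
# html escaping and link parsing are omitted for brevity.
def parse_inlines(line: str) -> str:
    markups = {"`": ("<code>", "</code>"), "|": ("<mark>", "</mark>")}
    out, i = [], 0
    while i < len(line):
        c = line[i]
        if c in markups:
            end = line.find(c, i + 1)        # markups are strictly single-line
            if end != -1:
                open_tag, close_tag = markups[c]
                out.append(open_tag + line[i + 1:end] + close_tag)
                i = end + 1
                continue
        out.append(c)                        # regular characters pass through
        i += 1
    return "".join(out)
```

an unclosed delimiter simply falls through as a regular character, which matches the strictness philosophy: no heuristic recovery.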

with parseInlineElements, we append the translated html output of the column, character by character, to the temporary buffer lexer.cur. we then return to parseColumn, where the next step, appendColumn, appends the temporary buffer to the main output string res.document. the temporary buffer is then cleared.

proc parseColumn(lexer: var Lexer, res: var Res, ckind: ColumnKind) =
   case ckind
   of heading: lexer.parseHeading
   of list: lexer.parseNested(ckind)
   of comment: lexer.parseSidenote(ckind)
   else: lexer.parseParagraph
   lexer.appendColumn(res, ckind)
nim


note that at this stage, lexer is already positioned at the start of the next column. we return to the root parseStoaFile procedure and parseLine again and again, until the end of the file.

proc parseStoaFile(
   lexer: var Lexer, res: var Res, lines: seq[string]
) =
   ...
   while lexer.lineno < lines.len:
      lexer.parseLine(res)
nim


to summarize, stoac’s translation pipeline is a strict, single-pass process:

detect column kind via finite state machine
route to the appropriate combinator
parse inlines within the column
append output and continue
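as a toy illustration of this pipeline, here's a python sketch that compiles a few stoa lines into html. the classification is simplified to prefix checks and the html mapping (h2/li/p) is my assumption, not stoac's actual output:

```python
# toy single-pass pipeline: classify each line, route to a handler,
# append html to one output buffer. mapping to h2/li/p is assumed.
def compile_stoa(lines):
    out = []
    for line in lines:
        if line.startswith("& "):            # heading column
            out.append("<h2>" + line[2:] + "</h2>")
        elif line.startswith("= "):          # list column
            out.append("<li>" + line[2:] + "</li>")
        elif line.strip():                   # anything else: paragraph
            out.append("<p>" + line + "</p>")
    return "\n".join(out)
```

the real stoac replaces the prefix checks with the fsm classifier, the handlers with combinators, and also runs the inline parser inside each column, but the overall shape is the same single pass.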

closing thoughts

it's actually been a long journey. and still, it's just the starting point of this journey. maybe i should make some wishes here.

my dream is to make my family live a better life.
i hope i can stop using mental health as my excuse for being lazy.
i hope i can create something.

if you are experiencing similar issues, i sincerely hope this blogpost, this blogsite, can help you, and that you can live a good life.

[1] neorg is the markup language i abused before crafting stoa. it's a beautiful & powerful language, especially inside neovim.