Another little tool.

Have you ever needed to take a text file and have it incorporated into your command-line tool where it can be used as a text resource?

Yeah, that’s me. So I built TextTools, a single command-line tool that can be incorporated into an Xcode build, converting a text file into a C string that can be included in your code.
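The generated output looks something like this. (A sketch of the idea only; the variable name and formatting here are illustrative, not necessarily TextTools’ exact output.)

/* Illustrative only; the actual output format may differ. */
const char *helpText =
    "usage: mytool [options] file\n"
    "  -v   verbose output\n";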

You’re welcome.

Abstracting my papers in one place.

Over the years I’ve described several algorithms and written several “papers” describing those techniques, and I’ve now abstracted them onto a new page on my blog. This includes a detailed description of the LR(1) algorithm used by the OCYacc tool I recently open-sourced, as well as papers on computer graphics and computational geometry. Hopefully they will be of use to someone out there.

OCTools

OCTools is a suite of tools which serve as a drop-in replacement for yacc and lex. The goal is to provide (roughly) source-compatible tools which can convert yacc and lex grammars into Objective-C output for building parsers that run on macOS and iOS.

The source kit can be found on GitHub.

Both tools can be compiled on the Macintosh using Xcode, and each builds a command-line tool which can be run from the terminal or included in an Xcode project. (Both tools are built in C, so they should be portable to other platforms; however, I haven’t done the port, so I’m not sure how successful porting would be.)

The goal of this project was to create Yacc and Lex analogs which generate re-entrant Objective C classes which implement the parsers, and for the Lex analog to use an Objective-C protocol definition for the input file, for maximum flexibility.
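For flavor, here is a minimal grammar in standard yacc syntax, the sort of input these tools are designed to accept. (This is generic yacc, not a grammar taken from the OCTools kit.)

%token NUMBER
%%
expr : expr '+' term
     | term
     ;
term : NUMBER
     ;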


More information can be found here: OCTools.

Designing a User Interface? First, create a visual language.

Reading through the comments on this Slashdot article, I’ve noticed a few things.

Apple Is Really Bad At Design

Everyone seems to agree that where Apple went off the rails was with iOS 7.

But as to why, no one seems able to agree.

In the original article linked to by Slashdot, “Apple Is Really Bad At Design,” the author seems to hinge his argument on the fact that the new operating system looks ugly:

In 2013 I wrote about the confusing and visually abrasive turn Apple had made with the introduction of iOS 7, the operating system refresh that would set the stage for almost all of Apple’s recent design. The product, the first piece of software overseen by Jony Ive, was confusing, amateur, and relatively unfinished upon launch. While Ive had utterly revamped what the company had been doing thematically with its user interface — eschewing the original iPhone’s tactility of skeuomorphic, real-world textures for a more purely “digital” approach — he also ignored more grounded concepts about user experience, systematic cohesion, and most surprisingly, look and feel.

Now it almost sounds like the original author was about to stumble on the truth.

But then, he fails:

“It’s not just that the icons on the homescreen feel and look like the work of a lesser designer. They also vary across the system. For instance, the camera icon is a different shape in other sections of the OS, like the camera app or the lockscreen,” I wrote at the time. “Shouldn’t there be some consistency?” While this may seem like obsessive nit-picking, these are the kinds of details that Apple in its previous incarnation would never have gotten wrong.

And, at the bottom of the stack, the essay seems inspired by the iPhone X, which is described repeatedly in the article as “ungainly and unnatural”, “bad design”, “a visually disgusting element.” And by extension the entire Apple environment is described as “fucking crazy.”

The comments in the Slashdot article also riff on the visually consistent or visually pleasing aspects of design:

The result is what you see now in Apple products – a muddled mess of different ideas that just don’t fit together right, and very little actual customer value.


And you know what? We see this focus on the beautiful in many other applications. You can see it in how we describe UX jobs and who gets hired for UI design: UI, UX: Who Does What? A Designer’s Guide To The Tech Industry.

UX designers are primarily concerned with how the product feels. A given design problem has no single right answer. UX designers explore many different approaches to solving a specific user problem. The broad responsibility of a UX designer is to ensure that the product logically flows from one step to the next. One way that a UX designer might do this is by conducting in-person user tests to observe one’s behavior. By identifying verbal and non-verbal stumbling blocks, they refine and iterate to create the “best” user experience. An example project is creating a delightful onboarding flow for a new user.

It’s worth reading the entire thing to understand the state of the industry, or the fact that sometimes

The boundary between UI and UX designers is fairly blurred and it is not uncommon for companies to opt to combine these roles.


You know what is missing here?

From the movie Objectified, a movie anyone who is a designer or interested in design must watch:

I think there are really three phases of modern design. One of those phases, or approaches, if you like, is looking at the design in a formal relationship, the formal logic of the object–the act of form-giving: form begets form.

The second way to look at it is in terms of the symbolism and content of what you’re dealing with: the little rituals that make up making coffee or using a fork and knife, or the cultural symbolism of a particular object. Those come back to habit and give form, help give guidance to the designer about how that form should be or how it should look.

The third phase, really, is looking at Design in a contextual sense, in a much bigger picture scenario. It’s looking at the technological context for that object. It’s looking at the human and object relationship.

In the first phase you might have something fairly new, like Karim Rashid’s Kone vacuum, which is for Dirt Devil. The company sells this basically as “so beautiful, you can put it on display.” In other words, you can leave it on your counter; it doesn’t look like a piece of crap.

Conversely you can look at Dyson and his vacuum cleaners. He approaches the design of his vacuum in a very functionalist manner. But if you look at the form of it, it’s really expressing the symbolism of function. The color introduced into it–he’s not a frivolous person–is really there to articulate the various components of the vacuum.

Or a more recent manifestation of a kind of contextual approach would be something like the Roomba. There, the relationship to the vacuum is very different. First of all, there is no more human interaction relationship. The relationship is to the room it’s cleaning.

I think it’s even more interesting that the company has kits available in the marketplace called “Create.” It’s essentially a Roomba vacuum cleaner kit that’s made for hacking. You can create really wacky things like the “Bionic Hamster,” which attaches the kind of play wheel or dome a hamster uses as a driving device for the Roomba. So it is the ultimate revenge on the vacuum cleaner.

How I think about it myself is that design is the search for form. What form should this object take?

– Andrew Blauvelt, Design Curator, Walker Art Center (around the 20 minute mark)


First, if you are really interested in design, do yourself the favor of watching Objectified.

But the most important part about the section above–to me, the most valuable two and a half minutes of that movie (outside of the opening sequence, of course)–is the notion that design is not just making things “look pretty.”

When you really look at it, the Dyson vacuum is an ugly looking contraption of a machine.

What is important is considering design as a formal process. And much of this “form-giving” really involves the visual language used to express the design of an object, since it is that visual language which helps us understand the object, our relationship to the object, and how we interact with that object.

Look again at the Dyson vacuum. The use of color is deliberate. Specifically, in the photo linked, the yellow parts are all components which either articulate, or which can be moved, taken apart or put back together. The brush, for example, is the yellow thing at the bottom–and can be easily disassembled and reassembled for cleaning. Even the bucket which stores dirt can be disassembled by separating the yellow and gray components, as can the pathway which brings air in from the hose. (If you look carefully you’ll see a small yellow thing just above the wheel; this is a lever used to disassemble the components which bring air into the storage bucket, and which periodically need cleaning.)

This use of yellow expresses not just the form or function of the device–but clearly articulates how to use the vacuum: how to remove the cord, how to turn it on, how to take it apart and put it together for maintenance.

Yellow, in this example, is part of the visual language, and that consistent use of yellow expresses function–guiding the user, making the vacuum easy to use.


Perhaps you’ve seen the poster. Perhaps not. But it says:

A user interface is like a joke.

If you have to explain it, it’s not that good.

The Dyson Vacuum, through the consistent use of the color yellow, and through the carefully considered shapes of the yellow components, does not need to explain itself. It’s clear how to take apart the air pathway. It’s clear how to remove and clean the basket. It’s even clear how to disassemble it down to the replaceable filter. Just consider the yellow parts, and twist, press or pull as the shape suggests.

This lesson reveals a larger one: if you want your user to understand how to use your interface, design and use a consistent visual language.

This includes the consistent use of shape and of color, so that the same shape or visual design performs the same way consistently across your entire application. Further, the same shape or visual design does not have multiple behaviors, and behaviors are not hidden behind different shapes or colors.

The best user interfaces are like the Dyson Vacuum. Good design does not have to be beautiful, but it should be suggestive of functionality. Good design should not hide functionality, though it does not need to be overly obtrusive.

And this is where things start falling apart on later versions of iOS. 3D Touch, for example, allows you to ‘hard press’ an icon on the home screen of a later iOS device (like the iPhone 7) and have a pop-up preview or menu appear.

But how do you know to ‘hard press’ that icon or image?

How many of you who own an iPhone 7 even know you can do this?

You see the same thing with Android with the “long press”, which was used in prior user interface guidelines to show a pop-up menu to reveal further functions. (The new Material design tries to get away from this gesture–but how many people know the way they delete e-mails on earlier versions of Android was to long-tap the e-mail?)

But hell, even Apple knows these “hidden” gestures create a problem:

Adopt Peek and Pop consistently. If you support Peek and Pop in some places but not others, people won’t know where they can use the feature and may think there’s a problem with your app or their device.

And yet Apple never answers that most fundamental question: how does the user even know they can hard-press something in the first place? How do they know they can swipe left? How do they know they can swipe down? How do they know what they can do?

Now on iOS 6, at least some of these questions were answered: buttons had rings around them or were icons arrayed in a row.

But in the tension between visual clarity and being visually clean and visually beautiful, beauty won over understanding. We’ve moved away from the Dyson Vacuum–from the “functionalist” approach of deconstructing design and creating a consistent visual language (where checkboxes, radio buttons, drop-down lists and the like are easily understood because we’ve consistently used similar art to express the same functionality)–towards a flat design which dispenses with any hints of functionality.

Today’s interfaces are as beautiful as users are befuddled.


But it gets worse. In dispensing with the logical deconstruction of interfaces and the creation of consistent user interface “languages” (such as consistently using the same gesture to mean the same thing regardless of the context in which we operate), modern design has become less about creating design manuals which express the visual language being used and how to consistently apply that language to solve design problems, and more about creating beautiful interfaces with absolutely no respect for usability. Usability has become, in today’s world, an afterthought: the equivalent of cargo cult thinking, where we echo form without understanding what gives form.

Just look at the first of the core values listed on Apple’s human interface guidelines page, which focuses on Aesthetic Integrity, Consistency and Feedback–but without ever describing what these mean other than from an aesthetic perspective.

It’s no wonder why people in the original Slashdot article claim Apple can no longer design stuff.

Because to us “design” is no longer about the formal relationship of the design, or about the symbolism and content–but about whether it looks pretty.

And thus, that notch on top of the iPhone X is considered “bad design” because it “looks ugly,” without ever considering whether the design (and the design language) serves a purpose or provides functional usability.

Because we don’t have the tools to describe the formal relationship between a human and a computer, all we’re left with is the artistic qualities of that object. So we look at iOS 7 and don’t understand why it fails. We don’t see that in moving towards the “printed page” look and feel of user interface design, iOS 7 has eschewed the very visual hints we need to know if a red line of text is a button, or just a highlighted passage.

We’re left not knowing if the yellow knob on the vacuum cleaner is a lever, or an immovable bit of plastic added for style that we will break if we attempt to twist it.

We don’t understand why iOS 7 (and later versions of iOS) fail, even though the rather ugly Dyson Vacuum succeeds.

And sadly we are left with the ultimate message of the Slashdot article: that iOS 7 (released in 2013) fails because the iPhone X (which has yet to ship) has a notch at the top.


Weirdly, in the process, our design becomes more artistically radical and yet more conservative, more entrenched. Design patterns we barely understand (such as the inverted “L” of web design, or the bottom tab buttons of a mobile device, or using grids to lay web pages out) are reused without understanding what motivates those choices, because they seem familiar to us. We no longer have the tools to create truly revolutionary designs because we no longer know what makes them work.

Worse, in our conservatism, mobile device design becomes web page design at a smaller scale. Web page design becomes mobile device design at a larger scale. Desktop applications become mobile web page design, but with menu bars and multiple windows.

All of it has become ritual without understanding why.

We’re dancing naked around a fire hoping the Gods will deliver rain to our crops.


You want to create good design?

Then first, you must create a consistent visual language. You must not start with a blank page and draw stuff on it until it looks pleasing. Instead, you must start with a “design manual” for your application.

And you must answer some questions–because in today’s age we no longer have design guidance like we used to.

Questions like “how should I decompose my application?” “How should I consistently present printed information, photos, lists?” “How should I separate sections of information?”

And even more basically, “what does a button look like?” “How does that button behave?” “How can I tell the user that my icon is tappable?” (and that could be simply a question of consistent placement), or “how do I tell the user that this button will reveal multiple options?”

Even deeper than this, we must answer fundamental questions such as “what are the nouns–the objects–in my application?” “What are the verbs?” (That is, the actions which operate on those nouns.) “What are the adjectives?” (The modifiers which modify nouns, like color to suggest a value is negative or bad or out of bounds.)

In many ways, because your users have a formal relationship with your application, and because your goal is to clearly communicate how to use what is probably a much more complex and feature-rich application than “pick up vacuum, press button, suck up dirt”, you probably want to eschew the beautiful in favor of the formal. (But even the Kone suggests how it is to be used: the seam bisecting the cone shows the point where the base separates from the enclosed vacuum, the flat top reveals the on/off button.)

And if that means you have little gray rectangles and small dots in places which make your application look more cluttered than you’d like–consider if your user is then able to use your application.

Beyond this, while you should respect the media and the established conventions where your design appears, so long as you are consistent in your designs you can explore new ideas and new gestures. And you can consider new problems few designers are considering–such as the fact that, on larger mobile devices, users can no longer hold the device in one hand and use it with their thumb if your controls are placed at the top of the screen, where they are no longer within reach.

Because that’s the bottom line, isn’t it? Not if your application is pretty, but if your users can use it–and continue to use it, and continue to help you generate revenue.

OhMyGoodness, getting rid of affordances makes it harder on users? Who knew?

As seen on /.: It’s official: Users navigate flat UI designs 22 per cent slower

Have you ever bonked your head on a glass door because you had no clue how to open the door–because the architects decided to make the design “clean” by getting rid of anything that ruined the clean lines of the door?

Yeah, that’s our modern UI environment.

[Photo: a frameless glass door]

I promise you this is a picture of a door. Do you see how to open it?


I mean, look at the examples provided here.

First, let’s dispense with the stupid items listed as “features of flat design.” They list, amongst the supposed advantages of flat design, “functionality,” “close attention to details” and “clear and strict visual hierarchy”–because before the invention of flat design, none of us wanted to deliver functionality, and most of us slopped our user designs together in the same way we slop pigs. (*eye roll*)

And let’s look at the supposed “advantages”: “simplicity of shapes and elements”, “minimalism” and “avoiding textures, gradients and complex forms.”

Which suggests to me the problem with the photo of my door above is that it contains a complex shape and unclear hierarchy which distracts from the functionality of the door.

Here. Let me fix that.

[Photo: the same door, edited down to a single minimalist shape]

I know the difference is subtle, but to the purist it makes the door much better looking. No more distractions from the pure essence of a door: one that has a single unitary shape, a minimalist door free of visual distractions.

Right up until you face-plant yourself because you can’t open the god-damned thing.

I mean, look at the animated example they give:

[Animation: a flat-design weather app]

Setting aside the cute (and distracting) animation of the weather icon to the side, how does the user know that by tapping and dragging he expands and shrinks a region? How does he know that it doesn’t scroll the page instead? Or that tapping (instead of swiping) would expand or shrink an area? Or that tapping instead pulls up an hourly prediction?

How does he know that swiping left and right gives the previous and next day’s weather prediction?

And notice the design is not entirely free from complex shapes. The two icons in the upper right? Is that two separate buttons, or a single toggle (as the shading suggests)?

Or notice the location in the Ukraine. Is the location tappable? Can we pick a new location?

The key here is that the user does not have a fucking clue. And let’s be honest: there is no delight in a “discovery” which seems more designed to make the user feel like a stupid idiot.

I’m not going to even address the complex and superfluous animations which, while cute (and perhaps even demanded in some markets), exist only to show off how great the application is, but provide absolutely no aid to user comprehension.


Look, I’m not asking for buttons and checkboxes and the like.

It’s not like you have to beat your users over the head; you can have clean lines and still use affordances which subtly guide the user on how to use your application. Just create a consistent visual language so that, for example, all shapes with a small dot in the corner can be resized by dragging.

But I am suggesting if the user needs to spend time figuring out how to open the door, they’re less likely to go through the door.

And you lose users. And revenue.

Some thoughts on designing a computer language.

Designing a computer language is an interesting exercise.

Remember first, the target of a computer language is a microprocessor or a microcontroller. And microprocessors or microcontrollers are stupid: they only understand integers, memory addresses (which are just like integers; think of memory as organized as an array of bytes), and if you’re lucky, floating point numbers. (And even there, they’re handled like integers but with a decimal point position. Stored, of course, as an integer.)

Because of that, most modern computer languages rely on a run-time library. Even C, which is as close to writing binary code for microprocessors as most of us will ever get, relies on a run-time library to handle certain abstract constructs the microprocessor can’t. For example, a ‘long’ integer in C is generally assumed to be at least 32 bits wide–but if you’re on a processor that only understands 16-bit integers, any 32-bit operation on a long integer must be handled with a subroutine call into a run-time library. And heck, some microcontrollers don’t even know how to multiply numbers, which means a * b has to translate internally into __multiply(a,b).
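To make that concrete, here is a sketch of the kind of shift-and-add helper a run-time library might supply. The name __multiply is taken from the example above; real run-time libraries use their own compiler-specific names (and often hand-tuned assembly):

/* Shift-and-add multiply for a CPU with no multiply instruction.
   Hypothetical helper; real libraries differ. */
unsigned __multiply(unsigned a, unsigned b)
{
    unsigned result = 0;
    while (b) {
        if (b & 1)           /* if the low bit of b is set... */
            result += a;     /* ...add the shifted copy of a  */
        a <<= 1;
        b >>= 1;
    }
    return result;
}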

For most general-purpose programming languages (like C, C#, C++, Java, Objective-C, Swift, Ada and the like), the question becomes “procedural programming” or “object-oriented programming.” That is, which paradigm will you support: procedures (like C), or objects (like Java)?

Further, how will you handle strings? How will you handle text like “Hello world?” Remember: your microprocessor only handles integers–not strings. And under the hood, every string is basically just an array of integers: “Hello world?” is stored in ASCII as the array of numbers [ 72, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100, 63 ], either marked somewhere with a length, or terminated with an end of string marker, 0.

In C, a string is simply an array of bytes. The same in C++, though C++ provides the std::string class which helps manage the array. In Java, all strings are translated internally into an array of bytes which is then immediately wrapped into a java.lang.String object. (It’s why in Java you can write:

"Hello world?".length()

since the construct “Hello world?” is turned into an object.) Objective-C turns the string declaration @”Hello world?” into an NSString, and Swift into a String type, which is related to NSString.
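To make the “array of bytes” point concrete, here is a sketch of how string length can be computed in C by scanning for that 0 terminator (essentially what the standard strlen does):

#include <stddef.h>

/* Walk the array of bytes until we reach the 0 terminator. */
size_t my_strlen(const char *s)
{
    const char *p = s;
    while (*p) ++p;
    return (size_t)(p - s);   /* "Hello world?" yields 12 */
}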

Declarations also become interesting. In C, C++ and Objective-C, you have headers, which force your language to provide a mechanism for representing external linkage. Those three languages also provide a mechanism for representing abstract types, meaning for every variable declaration:

int *a;

which represents the variable a that points to an integer, you must be able to write:

int *

which represents the abstraction of a variable which points to an integer.
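That abstraction shows up anywhere a type is needed without a variable attached, such as in casts and in prototypes. For example (the sort function here is hypothetical):

#include <stdlib.h>

int *a = (int *)malloc(10 * sizeof(int));   /* "int *" used in a cast */
extern void sort(int *, int);               /* and in a prototype     */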

And for every function:

int foo(int a, int *b, char c[5]) {...}

You need:

extern int foo(int, int *, char[5]);

But Java does not provide headers, so it has less need for header declarations–but then adds the need to mark methods as “public”, “protected” or “private” so we know the scope of methods and variables. (In C, those can be hidden simply by omitting the declaration from the header.)

This means Java’s type declaration system can be far simpler than C’s.

And while we’re at it, what types are you going to allow? Most languages have integer types, floating point types, structure or object types (which basically represent a record containing multiple different internal values), array types, and pointer or reference types. But even here there are differences:

C allows the use of unsigned values, like unsigned integers. Java, however, does not–but really, the only effective differences in performing math operations between signed and unsigned integers are right-shift operations and compare operations. And Java works around the former with the unsigned right shift (‘>>>’) operator.
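A short illustration of those two differences in C, assuming 32-bit ints:

#include <stdio.h>

int main(void)
{
    int      s = -8;
    unsigned u = 0xFFFFFFF8u;   /* the same bit pattern as -8 in 32 bits */

    /* Right shift: arithmetic for signed values (commonly; strictly
       speaking, implementation-defined in C), logical for unsigned. */
    printf("%d\n", s >> 1);     /* typically -4 */
    printf("%u\n", u >> 1);     /* 0x7FFFFFFC: a large positive value */

    /* Comparison: the same bits compare differently. */
    printf("%d\n", s < 1);      /* 1: -8 is less than 1 */
    printf("%d\n", u < 1u);     /* 0: as unsigned, the value is huge */
    return 0;
}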

C also represents arrays as simply a chunk of memory; C is very low level this way. But Java represents arrays as a distinct fundamental type, alongside basic types (like integers or floating point values) and objects.

And pointers or references can be explicit or implicit: C++ makes this explicit by requiring you to indicate in a function if an object or structure is passed by value (that is, the entire object is copied onto the stack), or by reference (that is, a pointer is passed on the stack). This makes a difference because updating an object passed by value has no effect on the caller. But when passed by reference, changes to the object can affect the caller’s copy–since there really is only one copy in memory.

Java, on the other hand, passes objects and arrays by reference, always.

This passing by reference makes the ‘const’ keyword (or its equivalent) very important: it can forbid the function being called from modifying the object passed to it by reference.
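Illustrated in C, where passing “by reference” means passing a pointer (the names here are mine):

struct Point { int x, y; };

/* By value: the callee gets a copy; the caller's Point is untouched. */
void nudgeCopy(struct Point p)      { p.x += 1; }

/* By reference: changes through the pointer affect the caller's copy. */
void nudgeInPlace(struct Point *p)  { p->x += 1; }

/* const forbids the callee from modifying the caller's object. */
void show(const struct Point *p)
{
    /* p->x += 1;  -- would be a compile-time error */
}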

On the flip side, Java does not have the concept of a ‘pointer’.

And let’s consider for(...) loops. The C language introduces the three-part for construct:

for (initializer; comparator; incrementer) statement

which translates into:

        initializer
loop:   if (!comparator) goto exit;
        statement
        incrementer
        goto loop;
exit:

But Java and Objective C also introduce different for loop constructs, such as Java’s

for (type variable: container) statement

which iterates the variable across the contents of the container. Internally it is implemented using Java’s Iterator interface, and translates the for loop above as:

        Iterator<type> iterator = container.iterator();
loop:   if (!iterator.hasNext()) goto exit;
        type variable = iterator.next();
        statement
        goto loop;
exit:

Of course this assumes container implements the Iterable interface. (Pro-tip: If you want to create a custom class which can be used as the container in a for loop, implement the Iterable interface.)

While we’re at it, if your language is object oriented, do you allow multiple inheritance, like C++ where an object can be the child of two or more parent objects? Or do you implement an “interface” or “protocol” (which specifies methods that are required to be implemented but provides no code), and have single inheritance, where objects can have no more than one parent object but can have one or more interfaces, such as in Java or Objective C?

Do you make exceptions a first-class citizen of your language, such as in Java or C++? Or are they a library, such as C’s setjmp/longjmp calls? Or are they even available? Early versions of Pascal did not provide for exception handling: instead, you had to either explicitly handle problems yourself, or check to make sure things didn’t go haywire: that you didn’t divide by zero, for example.
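As a sketch of the library approach, here is how C’s setjmp/longjmp can mimic try/throw/catch:

#include <setjmp.h>
#include <stdio.h>

static jmp_buf onError;

static int divide(int a, int b)
{
    if (b == 0) longjmp(onError, 1);   /* the "throw" */
    return a / b;
}

int main(void)
{
    if (setjmp(onError) == 0) {        /* the "try" */
        printf("%d\n", divide(10, 0));
    } else {                           /* the "catch" */
        printf("divide by zero!\n");
    }
    return 0;
}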

And we haven’t even dived into more advanced features. We’ve just stuck with the stuff that most general-purpose languages implement. Ada has built-in support for parallel processing by making threads and synchronization part of the language. (Languages like C or Swift require a library–generally based on POSIX threads–for parallel processing, though the availability of multi-threaded programming in those languages is optional.)

Other languages have built-in handling of mathematical vectors and matrices, or of string comparison and regular expressions. Some languages (like Java or LISP) provide support for lambda functions. And other languages combine domain-specific features with general-purpose computing–such as PHP, which allows general-purpose programs to be written but is designed for web pages.

Pushing farther afield, we have languages such as Prolog, a declarative language which defines the formal logic rules of a program without declaring the control flow to execute the rules.

(Prolog defines the relationships between a collection of rules, and performs a search through the rules in response to a query. Such a language is useful if we wish to, for example, provide a list of conditions that may be symptoms of a disease; a Prolog query would then list the symptoms, and after execution provide a list of diseases which correspond to those symptoms.)

But let’s ignore stuff like this for now, since my interest here is either procedural or object-oriented programming. (One could consider object-oriented programming as basically procedural programming performed on objects.)


The design of a programming language is quite interesting.

And how you answer questions like this (and other questions that may come up) really determines the simplicity of learning versus the expressive power of the language. Sadly, expressive power can become confusing and harm learning: just look at the initial promise of Swift as an easy and painless language to learn. A promise that has since been retracted, since Swift is neither a stable language (Swift 1 does not look like Swift 4), nor simple. Things like the type safety constructs ? (optional) or ! (forced) are hard to understand, since they rely on the concept of “pointers” and the safety (or lack thereof) of dealing with null pointers (that is, pointers to memory address 0, which typically means “not initialized” or “undefined”).

Or just look at how confusing the C type system can become to a beginner. I mean, it’s easy for a beginner to understand:

int foo[5];

That’s an array of 5 integers.

But what about:

char *(*(**foo[][8])())[];

What the hell???

Often you find C programmers avoiding the “expressive power” of C by using typedefs instead, declaring each component of the above as an individual type.
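For instance, here is one way to unwind that monster into typedefs (the type names are mine; read from the top down to build the original back up):

typedef char *CharPtr;            /* pointer to char                  */
typedef CharPtr CharPtrArray[];   /* array of pointers to char        */
typedef CharPtrArray *ArrayPtr;   /* pointer to such an array         */
typedef ArrayPtr Func();          /* function returning that pointer  */
typedef Func **FuncHandle;        /* pointer to pointer to function   */

extern FuncHandle foo[][8];       /* array of arrays of 8 of those    */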

It is in large part C’s “expressive power” (combined with its terse syntax) which allows contests like the International Obfuscated C Code Contest to exist: notice we don’t see an “Obfuscated Java Contest.”

Behold, a runner-up in that contest.

But at least it isn’t APL, a language once described to me as a “write-only programming language” because of how hard it is to read, making use of special symbols rarely found on older computers:

(~R∊R∘.×R)/R←1↓⍳R

This is the Wikipedia example of an APL program which finds all prime numbers from 1 to R.

No, I have no clue how it works, or what the squiggly marks mean.

Simplicity, it seems to me, forgoes expressive power. Java, for example, cannot express the idea of an array of pointers to functions returning pointers to arrays–since Java does not have the concept of a pointer to a function (that’s handled by the reflection API), nor does Java have the concept of pointers at all. Further, Java does not permit the declaration of complex anonymous structures: first, everything is a class; and second, classes are either explicitly named or implicitly named as part of an anonymous declaration. It’s hard to declare something like the following C++ declaration; you’re forced to break down each component into its own declaration.

struct Thing {
    struct {
        int x;
        int y;
    } loc;
    struct {
        int w;
        int h;
    } size;
};

And it’s just as well; this makes more sense if you were to write:

struct Point {
    int x;
    int y;
};

struct Size {
    int w;
    int h;
};

struct Thing {
    Point loc;
    Size size;
};

It becomes clear that “Thing” is a rectangle with a location and a size.

But then, people often complain that Java requires a lot more typing to express the same concept.


It’s a balance. It’s what makes all this so fascinating.

Quiet Insanity and YACC.

One of the things I wanted to do involves having a parser generated from a grammar, similar to YACC.

But I need the code generated in Objective C. And I need a parser that is re-entrant, so it can be run in a separate thread.

Now there are a number of solutions out there. But what I want is an LR(1) or GLR based parser built via a state machine which can be incorporated into Xcode and which generates Objective C code that can be used in an iPhone or iPad.

And let’s be honest, a lot of advice out there is really fucking stupid. Like this:

Code generation is not the “true way” in dynamic languages like Objective-C. Anything that can be achieved by a parser generator can be achieved at runtime. So, I’d suggest you try something like ParseKit, which will take a BNF-like grammar, and give you various delegate hooks you can implement to construct your parser.

That sound you just heard was my eyes rolling.

The reason, by the way, why you may wish to precompile a grammar rather than compile it at runtime is that generally (a) your grammar won’t change, and (b) the more computational time you can spend evaluating the grammar up front, the more compact the parser you generate can be.
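To see what compiling a grammar ahead of time buys you, here is a toy table-driven LR parser, built by hand for the two-rule grammar S → ‘a’ S | ‘b’. A real generator emits far larger tables of the same general shape; this is a sketch, not OCYacc’s actual output:

#include <stdio.h>

enum { TOK_A, TOK_B, TOK_EOF };              /* terminal codes      */
enum { ERR = 0, ACC = 100 };                 /* special actions     */

/* action[state][token]: >0 shift to that state, <0 reduce by rule -n */
static const int action[5][3] = {
    /*         a    b    $   */
    /* 0 */ {  2,   3,  ERR },
    /* 1 */ { ERR, ERR, ACC },
    /* 2 */ {  2,   3,  ERR },
    /* 3 */ { ERR, ERR, -2  },               /* reduce S -> b       */
    /* 4 */ { ERR, ERR, -1  },               /* reduce S -> a S     */
};
static const int gotoS[5]  = { 1, 0, 4, 0, 0 };  /* goto on S       */
static const int rhsLen[3] = { 0, 2, 1 };    /* rule RHS lengths    */

static int parse(const int *tokens)
{
    int stack[64], top = 0;
    stack[0] = 0;                            /* start in state 0    */
    for (;;) {
        int act = action[stack[top]][*tokens];
        if (act == ACC) return 1;            /* accept              */
        if (act == ERR) return 0;            /* syntax error        */
        if (act > 0) {                       /* shift               */
            stack[++top] = act;
            ++tokens;
        } else {                             /* reduce              */
            top -= rhsLen[-act];             /* pop the rule's RHS  */
            ++top;
            stack[top] = gotoS[stack[top - 1]];
        }
    }
}

int main(void)
{
    const int input[] = { TOK_A, TOK_A, TOK_B, TOK_EOF };
    printf(parse(input) ? "accepted\n" : "rejected\n");
    return 0;
}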

So, without any really good solutions out there, I thought: how hard can it be to roll my own?

Well, fairly hard, mostly because the documentation out there for constructing LR(1) grammars sorta sucks.

So I started writing a document that attempts to describe the LR parsing algorithms out there in sufficient detail to roll my own YACC-like parser.


I haven’t quite figured out the GLR grammar parsing techniques in “Efficient Parsing for Natural Language,” so they haven’t been included yet.

But the rest should be there.

And sadly the whole thing grew to 66 pages in length, even without the GLR stuff.

This is a preliminary version, of course. Eventually I plan to upload all this to my GitHub account.