Incremental compilation exploration

I’m still waiting for that “ah ha!” moment!

Is it mainly the standard library?

Echoing several others in this thread, the stdlib is one of the best for my most common use cases (back-end HTTP services). I wrote up a quick server-rendered web app the other day with it to experiment with the Discord API using OAuth2, HTTP server, and HTTP client. The whole thing was 275 lines of code (114 of which were the Discord JSON::Serializable types) and I only used a couple of my own shards for HTTP server routing and storing sessions in Redis.
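To give a flavor of how far the stdlib alone gets you, here’s a minimal sketch (not the Discord app itself, just the built-in HTTP server):

require "http/server"

# A stdlib-only HTTP server: one handler block, one TCP bind.
server = HTTP::Server.new do |context|
  context.response.content_type = "text/plain"
  context.response.print "Hello from the Crystal stdlib"
end

server.bind_tcp "127.0.0.1", 8080
server.listen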

Is it being able to prototype without specifying types in methods?

This is one of those things I love but feel is overused. I almost always add argument types to method signatures unless I’m intentionally duck-typing, because it typically gives better error messages (the type mismatch shows up at the call you care about rather than several calls deep). That said, when duck typing is useful, it’s really useful, so I agree with others that keeping it in the language is important to me.
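As a toy illustration (my own example), the same bad call produces a different error depending on whether the parameter is annotated:

def shout(name)
  name.upcase
end

def shout_typed(name : String)
  name.upcase
end

shout_typed("hi")
# shout_typed(42)  # compile error reported at this call: the argument must be a String
# shout(42)        # also a compile error, but reported inside the method body instead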

That brings me to one of my favorite things about any programming language in existence: union types! They almost completely solve the problem of NoMethodError in Ruby, NullPointerException in Java, and undefined is not a function in JavaScript. And they do it by taking a page out of Ruby’s own book and making Nil its own type, so if something can be nil, you have to handle that scenario, because it will inevitably be nil at some point and you’ll hate that even more. It’s such an elegant way to solve that problem.
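A minimal sketch of what that looks like in practice:

# Hash#[]? returns the value or nil, so the return type here is String | Nil.
def find_name(id : Int32) : String?
  {1 => "alice", 2 => "bob"}[id]?
end

name = find_name(3)
# Calling name.upcase right here would be a compile error: Nil has no method 'upcase'.
if name
  puts name.upcase # inside this branch the compiler narrows name to String
else
  puts "not found"
end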

Not to mention, sometimes you just need to let a value be one of several types, especially when you’re receiving it over the network from an untrusted system. This makes things like oneOf in OpenAPI trivial to implement, even when nested inside another schema.
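For example, a field that may arrive as either a number or a string is just a union in the type declaration (a small sketch of my own, not the Discord types mentioned above):

require "json"

struct Event
  include JSON::Serializable

  # Untrusted input: "id" sometimes arrives as a number, sometimes as a string.
  getter id : Int64 | String
  getter name : String
end

Event.from_json(%({"id": 42, "name": "signup"})).id        # => 42
Event.from_json(%({"id": "abc-123", "name": "signup"})).id # => "abc-123"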

Is it the concurrency model?

This is a big thing that I think a lot of Crystal devs get to take for granted. Most of us aren’t using it directly if we’re working on HTTP services, but we still get the benefit of it over thread pools alone. For example, spinning up a new fiber per request in HTTP::Server offers flexibility that web servers in some other programming languages can’t give you. In Ruby, you’re typically limited to one request per thread or process, so no other requests can even begin processing until one of those is freed up. It also opens up ideas like an elastic connection pool rather than parking a connection in thread-local memory the way ActiveRecord does, since you can’t do that in Crystal anyway. ActiveRecord could’ve done something like DB::Pool, but it was easier to hang connections on the threads themselves because that’s just how things are done in Ruby, despite it using more DB connections on average.
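A rough feel for the model, as a generic sketch rather than HTTP::Server internals: each unit of work gets its own fiber, and a channel coordinates the results.

# Spawn a lightweight fiber per "request"; a channel collects the results.
results = Channel(String).new

5.times do |i|
  spawn do
    sleep 10.milliseconds # stand-in for I/O: a DB query, an HTTP call, ...
    results.send "request #{i} handled"
  end
end

5.times { puts results.receive }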

Is it macros and compile-time reflection?

This is another thing I think we could easily take for granted. Even if you’re not writing your own macros in your apps, you’re almost certainly getting the benefit of macros. JSON::Serializable and other implementations following the same convention (YAML::Serializable, DB::Serializable, MessagePack::Serializable, etc.) may be hairy piles of macros that very few people want to mess with, but the concise, expressive code they enable us to write is downright incredible.
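Even a small hand-rolled macro shows the flavor (an illustrative example of my own, nothing from those shards):

# Generates one predicate method per name at compile time.
macro predicates(*names)
  {% for name in names %}
    def {{name.id}}?
      !@{{name.id}}.nil?
    end
  {% end %}
end

class Session
  predicates user_id, csrf_token

  @user_id : Int32?
  @csrf_token : String?

  def initialize(@user_id = nil, @csrf_token = nil)
  end
end

puts Session.new(user_id: 42).user_id? # => true
puts Session.new.csrf_token?           # => false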

What if we require you to require all file dependencies up-front?

Not gonna lie, explicit dependencies are one thing a lot of people hate that I actually like, and I would love to see this added to Crystal. Declaring dependencies per file can be annoying and repetitive, and I remember being delighted when I first started using Rails back in 2005 and didn’t have to do that, but the delight was short-lived. Once I got into a large Rails app I got frustrated because it became harder to know where things were defined. Is it a gem or my app? If it’s my app, is it in app/, lib/, config/initializers/, etc.? If it’s in app/, is it in models/, services/, or any of the dozens of other directories there?

Urging people to think about their dependencies is a good thing, imo. Sometimes when looking at Java or JavaScript files, you can tell right off the bat that an object does too much simply because the series of import statements is huge. It makes the file feel heavier at first glance, without even looking at the number of lines of code. When a language or framework removes the irritation of dealing with dependencies, it also almost necessarily removes the feedback that irritation provides.
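A small stdlib-only sketch of what that looks like at the top of a file (the commented-out path is hypothetical):

# The require list doubles as a weight signal: the longer it gets,
# the more this file probably does.
require "http/server"
require "json"
require "uri"
# require "./models/user"  # app-local files are pulled in the same way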

16 Likes

Could this not be a thing where both ways are possible?
If you want or need clarity, then use explicit includes, which also gives you a possible speed-up in compile time (if I understood asterite’s blog correctly).
And if you do use at least one explicit include, then you need to specify everything, or at least anything not explicitly included gets flagged as a warning at compile time.

Incremental compile

Increment means a ‘discrete increase of something’, and the discrete part is a unit of some kind.

When you compile incrementally, you extend or update what’s already saved in the compiled collection with a new unit.
For software, the unit is a file of text in a certain language. But a whole file is not easily compiled fast.

But what about this approach:

each unit fed to the compiler holds just one definition or implementation!

How to start

Today’s old-fashioned style of structuring program source must be modernized, I think.

Split each selected module into one file per type and flatten them into one directory? An enum is a single file, an alias too. Each method is a single file! (Or three units: doc, signature, and body.) A module holds lines with annotated source (or include?). Each class has a list of methods to include/source. Distinguish smartly between public and private methods.

This restructuring must still be compilable, and it must be possible to reassemble it into self-contained sources (for backup reasons?). Perhaps shard.yml tells which modules are to be, or might be, compiled incrementally.

After this one-time initial restructuring, the compiler compiles the modules in the repo/project as usual, but with some further information stored besides today’s ./cache/crystal/repo… (with .bc and .o). Call it metainfo here.

Probably defs/refs for each implemented type, the list of things to include in a module, the list of properties in a class. Or an annotated file path to the .cr source.

One increment

Experimental

Feed the compiler a change request for some unit:

> crystal_inc -c 'updatecontent alias' --ident 'MODULE_R::ResultRow' --content 'alias ResultRow = Array(TableRow)'

The compiler will try to recompile the component and all units depending on MODULE_R::ResultRow, and update the cache if it succeeds. It can fail for two reasons, I think:

  1. The definition was wrong. Don’t update the cache.
  2. The definition was correct. Update that unit, but mark all (or the first) dependent units as errored.

If the compile succeeded and we want to run, then we must tell the compiler to build an image. That distinction - two different usages - is very important.

The compiler can compile into .o files - no automatic image build.
The compiler can link them into an image.

IDE

Compiler as a server

Now we are talking!

Start the compiler as a server listening on a port.
(A first experiment could use a web client as the IDE, but later on some kind of ‘native IDE’ is probably more suitable.)
The programmer sees a list of all modules. They can select a module and expand it into components, then select a component for UPDATE - just one component at a time. They can select things to view as read-only, and also see aggregations like all methods in a class.

(Perhaps prototype the web client by extracting info from docs/index.json and displaying a random unit to edit.)

This is like Smalltalk

Well, you edit the content of the unit. A text editor is used for the content of enums, for the body of a method, for the body of a macro, a struct?

Other things you select in order to rename, delete, or show defs/refs, or to add a new component. All changes cause a corresponding compile request to the server. But no link, by default.

The increment is now the window with the changed text. Tell the server: this unit has changed.

If an error is detected you get feedback. Pick a correction step from the feedback list.

When everything is changed and compiled you can tell the server to build an image!! There is no automatic build after each successful incremental compile!

Issues

  1. Perhaps a ‘source’ "<filepath>" is the first thing to introduce? With annotations. Any extension accepted.
  2. GitHub and versions?
  3. Reliable source content and structure. What happens to the original source?
  4. Macros, generics, circular references, shards.
  5. Heavy support of two compilers and tools - or?
  6. It’s a radical change in how to manage very soft software into harder software. True for most software (besides Smalltalk and …?).
  7. Negative reactions from programmers. Don’t touch my source. Can’t use my favourite editor. I’m not a fill-in-the-form programmer.
  8. Formatting, syntax coloring, suggestions (they will come…).
  9. All in all, perhaps most suitable for business applications. But one can mix modules of ‘both’ kinds in a repo.
  10. Will there be unsolvable situations in choosing unit compile order due to cyclic dependencies?

Smalltalk

My experience is from some months in the early ’90s: prototyping in a paint shop at a car manufacturer. Lots of drawings and interactions.

The source development environment was very powerful. Traditional Smalltalk-80 views.

But

  1. When you edit a method (receiver) and save it, it is compiled. You can run immediately, but the IDE will not check whether a class/instance implements a method; that’s impossible at compile time. (But Crystal does, of course!) So you might get the message ‘MessageNotUnderstood’ when running. Very annoying.
  2. The views of a module were restricted. It became cluttered when you had to start another browser in order to inspect other classes or components. Views of all methods for a class were also missing, and the working views were not tailorable.

If you aren’t familiar with Smalltalk, at least look at some of the views by searching the web for ‘smalltalk’ and choosing images.

Source

Who’s in charge? For safety reasons the metainfo can’t be the source of truth, so an ordinary source tree is still essential. It must be possible to share and version it. So when crystal_inc compiles the required units it must reproduce some sources. At any time this source must be completely compilable as the whole module source. crystal_inc must also be able to reproduce all the source so that it looks like traditional source.

The content of the safe source could be like the following outline. (It’s mostly a list of things to somehow include.)

module A
	["Alias"]
	source "all_alias.txt"
	["Enums"]
	source "all_enums.txt"
	["Classes"]
	source "all_classes.txt"
	...
end

I read the dev.to blog post and am excited about the possibilities of Crystal.
At the same time, I am amazed that we can get object dependencies with so little additional code.

Really enjoyed reading those articles! They provided lots of insight. Curious what happened since - was progress made, any new discoveries, etc.?

Sorry for necro.

1 Like

I didn’t have more time to work on this.

I was just exploring this, but my conclusion is that it’s impossible to do incremental or modular compilation for Crystal without changing the language. At this point I don’t need thorough research to reach that conclusion.

As someone who programs as a hobby, here is what I imagine.

Ruby is dynamically typed; type declarations are not written inside the code but in external files. I wish this strategy could also be realized in Crystal, because it makes it possible to define various types for the same code. Users might explicitly specify types, or the types inferred by the Crystal compiler during the last compile might be saved as a kind of .lock file. Even within the same library, different type files might be generated depending on the caller. When compilation doesn’t go well, you clear the type files.
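Purely to illustrate the idea (this format does not exist in Crystal; it’s a hypothetical sketch loosely modeled on Ruby’s RBS, reusing Crystal’s annotation syntax), such an external file might record the signatures inferred during the last compile:

# hypothetical sidecar file "user_service.types" - no such format exists today
UserService#find(id : Int32) : User | Nil
UserService#create(name : String, email : String) : User
UserService#all : Array(User)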

There are already many statically typed languages that require type specification, have difficulty with duck typing, and exhibit high performance. I hope for Crystal, which has a different charm from these, to grow while preserving its uniqueness.

Could you name one, please?

1 Like

Perhaps you misread? Or did I? Java, C#, etc., they all fall into this category: require type spec, have difficulty with duck typing, exhibit high performance. Am I missing something?

1 Like

Ah, yes, I misread. Crystal doesn’t require type specs, that’s why it’s impossible to do incremental compilation.

And requiring types in some cases (gradual typing) isn’t going to solve this either. What we need to do is to require types in all methods. Maybe introduce interfaces or protocols, etc. It’s the only way.

Otherwise, I suggest using Crystal for small projects, microservices, etc. Please don’t use it for large projects!

Crystal is amazing for microservices, tbh. Not just because of shorter compiles for smaller codebases (30-60 seconds for most of mine, even with --release) but also very low memory consumption in production. All my services use anywhere from 1-50MB (the high end would be a lot lower if not for this issue, but it can be mitigated for non-exceptional failures), with most services hovering in the 6-15MB range.

3 Likes

Not exactly the thing you’d like to hear from a core team member…

Used to be all the rage, now I hear monoliths are back in fashion…

5 Likes

Thank you for answering.

For me, I’d rather have to write types for everything in order to enable the improvement of:

  • Compilation speed - Lucky takes ~37 seconds to add one model to an empty-yet-compiled project on my laptop.
  • LSP - Working consistently (all code) and quickly (currently takes seconds even on small projects, most other LSPs work on the order of tens or hundreds of milliseconds).

Crystal :heart: is the nicest language I know of, all categories. It’s the one language that I can see myself using for everything. It’s also somewhat painful to develop with due to the above issues, especially for projects that exceed a few thousand lines, and especially for libraries, where the LSP stops working (errors, auto-complete, suggestions) because the code isn’t yet called by anything.

What’s the core team’s position on required type declarations? Is there a discussion about this?

I realize that everyone has different goals and priorities, but for me having to write 10% more code is a small price to pay compared to a scenario where the suggested solution is not using the language… :confused:

4 Likes

The lack of incremental compilation doesn’t make Crystal any less powerful a language. Crystal is very capable and is only improving with each release, and while things like incremental compilation would be nice, I would very much prefer the language we have now rather than one with all these fancy features and poor performance/memory management/etc.

3 Likes

… one with all these fancy features and poor performance/memory management/etc.

Incremental compilation and general compiler improvements wouldn’t change the language as we use it, aside from the addition of required type annotations, which already exist in the language.

This means that this:

getter names

… would become this:

getter names : Array(String)

For me, this isn’t a problem. In all honesty, I don’t understand why it would be a problem for anyone given the very expensive trade off it comes with.

I strongly believe that Crystal needs at least one order of magnitude better compile times in order to remain attractive in a competitive area where Rust programs compile in milliseconds, much thanks to incremental compilation.

1 Like

@Devonte That’s a bit of a straw man argument. Incremental compilation isn’t a “fancy” feature nor would it affect runtime performance.

The case being made is about compilation performance. When a codebase crosses some LoC threshold, compilation speed drops quite a bit. If you’re iterating on some piece of code, the change/compile loop can be pretty slow. The example above mentions 37 seconds.

1 Like

Perhaps I should have been clearer in my post: I know that incremental compilation wouldn’t affect runtime, my point is that priorities from active core members were on more important features like performance and memory management. That’s not to diminish the benefits of incremental compilation, but I think it’s significantly less important compared to those features.

I would also argue in this context that incremental compilation is a fancy feature, in the sense that it has only relatively “recently” become the focal point of many languages. I disagree with the notion that newer languages like Crystal should be moving towards its adoption simply because other languages are doing so, and while some valid points for incremental compilation have been made, a lot of them have some relation to type inference (if not being completely based on that) which seems to be the underlying issue. I don’t think compilation speed would be much of an issue if some of the suggestions to managing type inference were implemented, and that would likely make incremental compilation a redundant/lesser priority feature.

3 Likes

Well, some of us would be sad to lose the artistic freedom that

def tickle_it(obj)
  puts obj.tickle
end

gives.
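For instance, continuing that example, anything that responds to tickle works:

struct Elmo
  def tickle
    "hee hee hee!"
  end
end

struct Robot
  def tickle
    "tickling not supported" # still responds to tickle, so tickle_it accepts it
  end
end

tickle_it(Elmo.new)  # prints: hee hee hee!
tickle_it(Robot.new) # prints: tickling not supported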

4 Likes

I don’t think compilation speed would be much of an issue if some of the suggestions to managing type inference were implemented, and that would likely make incremental compilation a redundant/lesser priority feature.

If that means a decrease in compile times by two orders of magnitude, I’m all for it. My suspicion is that those numbers, without incremental compilation, are very hard to come by.

1 Like