RFC: Standard Image Types in Stdlib

Summary

Add standard image types (colors, pixel formats, color spaces) to Crystal’s standard library to provide a common foundation for the image processing ecosystem.

Note

This RFC is meant to start a discussion about standardizing image types in Crystal. If the community finds value in this approach, anyone interested can help move it forward.

Motivation

The Problem: Ecosystem Fragmentation

Crystal’s image ecosystem is fragmented. Each library defines its own incompatible types:

Library A:

struct RGBA
  property r, g, b, a : UInt16
end

Library B:

module Color
  struct RGBA
    property r, g, b, a : UInt8
  end
end

Library C:

struct Color
  property r, g, b, a : UInt8
end

Result:

  • Developers must learn different APIs for each library
  • No shared understanding of color representation
  • Unclear semantics (premultiplied? color space? bit depth?)
  • Difficult to build tools that work across libraries
  • Each library reinvents basic concepts

Comparison to Other Languages

Go - Standard library provides foundation:

// In stdlib (import path "image/color")
package color

type RGBA struct {
    R, G, B, A uint8
}

type RGBA64 struct {
    R, G, B, A uint16
}

Most Go image libraries build on these types.

Rust - Community standard via image crate:

pub trait Pixel {
    type Subpixel: Primitive;
    const CHANNEL_COUNT: u8;
}

pub struct Rgba<T>([T; 4]);
pub struct Rgb<T>([T; 3]);

Most Rust image libraries implement these traits.

Python - PIL/Pillow provides foundation:

from PIL import Image

# Standard modes: "RGB", "RGBA", "L" (grayscale), etc.
img = Image.new("RGB", (width, height))
pixel = img.getpixel((x, y))  # Returns (r, g, b) tuple

Most Python image libraries interoperate via PIL’s standard modes.

Java/Kotlin - Java stdlib provides foundation:

// In java.awt.image
BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
int rgb = img.getRGB(x, y);  // Standard packed ARGB format

Kotlin on the JVM uses java.awt.image.BufferedImage; Android uses android.graphics.Bitmap.

What We Need

A minimal foundation that defines:

  1. Standard color types with clear semantics
  2. Standard pixel format definitions
  3. Standard buffer interface
  4. Standard color space definitions

This allows libraries to:

  • Share a common vocabulary
  • Interoperate naturally
  • Build on a solid foundation
  • Avoid reinventing basics

Guide-level explanation

For Library Users

When all libraries use the standard foundation, you have consistency:

require "image"  # Standard foundation

# All libraries use the same color types
lib_a_color = Image::RGBA8.new(255, 0, 0, 255)
lib_b_color = Image::RGBA8.new(255, 0, 0, 255)

# Same type everywhere
lib_a_color == lib_b_color  # true

For Library Authors

Build on the foundation instead of reinventing:

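require "image"

module MyImageLib
  class Canvas
    getter width : Int32
    getter height : Int32
    @pixels : Slice(Image::RGBA8)

    # Start with a fully transparent canvas
    def initialize(@width, @height)
      @pixels = Slice(Image::RGBA8).new(@width * @height, Image::RGBA8.new(0_u8, 0_u8, 0_u8, 0_u8))
    end

    # Row-major lookup: rows are stored one after another
    def [](x : Int32, y : Int32) : Image::RGBA8
      @pixels[y * @width + x]
    end
  end
end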

Reference-level explanation

Proposed Standard Library Addition

Add Image module to Crystal stdlib (or create minimal crystal-image shard):

Color Types

module Image
  # 8-bit RGBA (premultiplied alpha)
  struct RGBA8
    getter r : UInt8
    getter g : UInt8
    getter b : UInt8
    getter a : UInt8
    
    def initialize(@r, @g, @b, @a = 255_u8)
    end
    
    # Premultiplied alpha semantics
    # Valid range: 0 <= r,g,b <= a <= 255
  end
  
  # 8-bit RGBA (non-premultiplied alpha)
  struct NRGBA8
    getter r : UInt8
    getter g : UInt8
    getter b : UInt8
    getter a : UInt8
    
    def initialize(@r, @g, @b, @a = 255_u8)
    end
  end
  
  # 16-bit RGBA (premultiplied alpha)
  struct RGBA16
    getter r : UInt16
    getter g : UInt16
    getter b : UInt16
    getter a : UInt16
    
    def initialize(@r, @g, @b, @a = 65535_u16)
    end
  end
  
  # 16-bit RGBA (non-premultiplied alpha)
  struct NRGBA16
    getter r : UInt16
    getter g : UInt16
    getter b : UInt16
    getter a : UInt16
    
    def initialize(@r, @g, @b, @a = 65535_u16)
    end
  end
  
  # 8-bit RGB (no alpha)
  struct RGB8
    getter r : UInt8
    getter g : UInt8
    getter b : UInt8
    
    def initialize(@r, @g, @b)
    end
  end
  
  # 8-bit grayscale
  struct Gray8
    getter y : UInt8
    
    def initialize(@y)
    end
  end
  
  # 16-bit grayscale
  struct Gray16
    getter y : UInt16
    
    def initialize(@y)
    end
  end
end
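
The difference between the premultiplied and non-premultiplied structs is easiest to see in a conversion. The following is only a sketch: `premultiply` and `scale` are hypothetical helpers that are not part of the proposal (the RFC leaves conversion to libraries), and the structs are trimmed-down copies so the example compiles on its own.

```crystal
module Image
  # Trimmed-down copies of the RFC structs so the sketch compiles on its own.
  struct RGBA8
    getter r : UInt8
    getter g : UInt8
    getter b : UInt8
    getter a : UInt8

    def initialize(@r, @g, @b, @a = 255_u8)
    end
  end

  struct NRGBA8
    getter r : UInt8
    getter g : UInt8
    getter b : UInt8
    getter a : UInt8

    def initialize(@r, @g, @b, @a = 255_u8)
    end

    # Hypothetical helper (not in the RFC): scale each color channel by alpha,
    # rounding to nearest, so that r, g, b <= a holds in the result.
    def premultiply : RGBA8
      RGBA8.new(scale(r), scale(g), scale(b), a)
    end

    private def scale(channel : UInt8) : UInt8
      ((channel.to_u32 * a.to_u32 + 127_u32) // 255_u32).to_u8
    end
  end
end

half_red = Image::NRGBA8.new(255_u8, 0_u8, 0_u8, 128_u8)
pre = half_red.premultiply
puts pre.r # => 128
puts pre.a # => 128
```

After the conversion the red channel equals the alpha value, which is exactly the `r,g,b <= a` invariant noted on RGBA8.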

Color Space

module Image
  # Standard color spaces
  enum ColorSpace
    # sRGB (IEC 61966-2-1) - default for most images
    SRGB
    
    # Linear RGB - for compositing and blending
    LinearRGB
    
    # Adobe RGB (1998) - wider gamut
    AdobeRGB
    
    # Display P3 - modern wide gamut displays
    DisplayP3
  end
end

Pixel Format

module Image
  # Describes pixel encoding in memory
  enum PixelFormat
    RGBA8      # 4 bytes: R, G, B, A (premultiplied)
    NRGBA8     # 4 bytes: R, G, B, A (non-premultiplied)
    RGB8       # 3 bytes: R, G, B
    GRAY8      # 1 byte: Y
    
    RGBA16     # 8 bytes: R, G, B, A (premultiplied, little-endian)
    NRGBA16    # 8 bytes: R, G, B, A (non-premultiplied, little-endian)
    RGB16      # 6 bytes: R, G, B (little-endian)
    GRAY16     # 2 bytes: Y (little-endian)
    
    def bytes_per_pixel : Int32
      case self
      in .rgba8?, .nrgba8? then 4
      in .rgb8? then 3
      in .gray8? then 1
      in .rgba16?, .nrgba16? then 8
      in .rgb16? then 6
      in .gray16? then 2
      end
    end
    
    def premultiplied? : Bool
      # An exhaustive `case ... in` cannot take an `else` branch,
      # so a plain boolean expression is used instead.
      rgba8? || rgba16?
    end
  end
end
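
As a rough illustration of how a library might use `bytes_per_pixel`, the sketch below sizes a tightly packed buffer. `packed_size` is a hypothetical helper, the enum is a reduced copy of the one above, and rows are assumed to have no padding.

```crystal
module Image
  # Reduced copy of the RFC enum so the sketch compiles on its own.
  enum PixelFormat
    RGBA8
    RGB8
    GRAY16

    def bytes_per_pixel : Int32
      case self
      in .rgba8?  then 4
      in .rgb8?   then 3
      in .gray16? then 2
      end
    end
  end
end

# Hypothetical helper: bytes needed for a tightly packed image (no row padding).
def packed_size(format : Image::PixelFormat, width : Int32, height : Int32) : Int32
  stride = width * format.bytes_per_pixel # bytes per row
  stride * height
end

format = Image::PixelFormat::RGB8
pixels = Bytes.new(packed_size(format, 640, 480))
puts pixels.size # => 921600
```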

Rectangle

module Image
  # Axis-aligned rectangle
  struct Rectangle
    getter min : Point
    getter max : Point
    
    def initialize(@min, @max)
    end
    
    def width : Int32
      max.x - min.x
    end
    
    def height : Int32
      max.y - min.y
    end
  end
  
  struct Point
    getter x : Int32
    getter y : Int32
    
    def initialize(@x, @y)
    end
  end
end

Buffer Interface (Minimal)

module Image
  # Minimal interface for pixel buffers
  # Libraries can implement this or provide their own richer APIs
  module Buffer
    abstract def width : Int32
    abstract def height : Int32
    abstract def pixel_format : PixelFormat
    abstract def color_space : ColorSpace
    
    # Optional: direct buffer access
    # abstract def to_unsafe : Pointer(UInt8)
    # abstract def stride : Int32
  end
end
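
To show what implementing the interface could look like in practice, here is a minimal sketch. `GrayCanvas` and `describe` are hypothetical names, and the module is a reduced copy of the proposed one (`color_space` is left out for brevity).

```crystal
module Image
  # Reduced copies of the RFC definitions so the sketch compiles on its own.
  enum PixelFormat
    RGBA8
    GRAY8
  end

  module Buffer
    abstract def width : Int32
    abstract def height : Int32
    abstract def pixel_format : PixelFormat
  end
end

# Hypothetical library class: a grayscale canvas backed by a flat byte slice.
class GrayCanvas
  include Image::Buffer

  getter width : Int32
  getter height : Int32
  @data : Bytes

  def initialize(@width, @height)
    @data = Bytes.new(@width * @height)
  end

  def pixel_format : Image::PixelFormat
    Image::PixelFormat::GRAY8
  end

  def []=(x : Int32, y : Int32, value : UInt8)
    @data[y * @width + x] = value
  end

  def [](x : Int32, y : Int32) : UInt8
    @data[y * @width + x]
  end
end

# Code written against the interface works with any conforming buffer.
def describe(buffer : Image::Buffer) : String
  "#{buffer.width}x#{buffer.height} #{buffer.pixel_format}"
end

canvas = GrayCanvas.new(4, 3)
canvas[1, 2] = 200_u8
puts describe(canvas) # => 4x3 GRAY8
```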

Design Principles

  1. Minimal: Only essential types, not operations
  2. Clear semantics: Explicit about premultiplication, bit depth, endianness
  3. Type-safe: Different types for different formats
  4. Extensible: Libraries can add their own types

What This Does NOT Include

  • Image I/O (PNG, JPEG, etc.) - libraries handle this
  • Image operations (resize, filter, etc.) - libraries handle this
  • Format conversion - libraries handle this
  • Drawing primitives - libraries handle this

This is foundation only - the building blocks.

Migration Path

Libraries can adopt incrementally:

Phase 1: Use standard color types internally

# In any library
alias RGBA = Image::RGBA16  # Use standard type

Phase 2: Expose standard types in API

def [](x, y) : Image::RGBA16

Phase 3: Implement standard interfaces

class Canvas
  include Image::Buffer
end

Drawbacks

  1. Stdlib bloat: Adds types to standard library
  2. Breaking changes: Existing libraries must migrate
  3. Not comprehensive: Doesn’t cover all pixel formats
  4. Opinionated: Makes choices about representation

Rationale and alternatives

Why in stdlib (or minimal shard)?

Stdlib benefits:

  • Always available, no dependencies
  • Stable API, semantic versioning
  • Community standard by default
  • Like Go’s image/color package

Minimal shard alternative:

  • Faster iteration
  • Can be adopted before stdlib inclusion
  • Easier to experiment

Why these specific types?

RGBA8/RGBA16: Most common formats
Premultiplied vs non-premultiplied: Both needed, different use cases
Explicit bit depth: Prevents confusion

Alternative 1: Do nothing

Let ecosystem remain fragmented.

Rejected: Problem will worsen as more libraries emerge

Alternative 2: Pick one library as standard

Make everyone use one specific library’s types.

Rejected: Creates monopoly, not neutral

Alternative 3: Comprehensive image library

Build full-featured stdlib image library.

Rejected: Too large for stdlib, limits innovation

Alternative 4: Just protocols/interfaces

Define interfaces without concrete types.

Rejected: Doesn’t solve the “every library has different RGBA” problem

Prior art

Go: image/color package

package color

type Color interface {
    RGBA() (r, g, b, a uint32)
}

type RGBA struct {
    R, G, B, A uint8
}

Success: Most Go image libraries build on these types

Rust: image crate

pub trait Pixel {
    type Subpixel;
    const CHANNEL_COUNT: u8;
}

pub struct Rgba<T>([T; 4]);

Success: De facto standard in Rust ecosystem

Python: PIL/Pillow

from PIL import Image

# Standard image modes
img = Image.new("RGB", (width, height))
img = Image.new("RGBA", (width, height))
img = Image.new("L", (width, height))  # Grayscale

Success: Standard modes used across Python image ecosystem

Java: java.awt.Color

public class Color {
    public Color(int r, int g, int b, int a)
}

Success: Standard in Java ecosystem

Unresolved questions

Before RFC acceptance:

  1. Stdlib or separate shard?

    • Stdlib: More authoritative, always available
    • Shard: Faster iteration, easier adoption
  2. Which types to include?

    • Current proposal: RGBA8, NRGBA8, RGBA16, NRGBA16, RGB8, Gray8, Gray16
    • Missing: CMYK, YCbCr, HSV, LAB - add later?
  3. Color space handling?

    • Enum sufficient or need more metadata?
    • ICC profile support?
  4. Endianness?

    • Assume little-endian or make explicit?
  5. Buffer interface?

    • Include in foundation or leave to libraries?

During implementation:

  1. How to encourage library adoption?
  2. Migration guide for existing libraries
  3. Performance implications
  4. Documentation and examples

Out of scope:

  1. Image operations (resize, filter, etc.)
  2. Format I/O (PNG, JPEG readers/writers)
  3. Color space conversion algorithms
  4. Drawing primitives

Future possibilities

Additional color types

struct CMYK8
struct YCbCr8
struct HSV
struct LAB

Color conversion

module Image
  def self.convert(color : RGBA8, to : ColorSpace) : RGBA8
end

Buffer utilities

module Image
  class SimpleBuffer
    include Buffer
    # Reference implementation
  end
end

Format registry

module Image
  def self.register_format(name : String, reader : FormatReader)
end

But these are future enhancements - start with minimal foundation.

Ruby has a library called red-colors.

The author, mrkn, is a Ruby committer who specializes in creating libraries for numerical calculations. Although this library is not very well known, it may be worth taking a look at.

I think image types should be provided as standalone shards, similar to crystal-db, even if they are official.

Oh, you are taking the idea further. Nice. I was thinking more as a shard but this makes sense too.

The question here is how much we favor interoperability with other libraries and tools outside Crystal, vs just interoperability with other Crystal libraries. There is definitely an argument for BGRA8 (*) as well (probably in addition to RGBA8), due to how common it is (originally due to being the native format in VGA). There are definitely performance questions here, as images can be big enough that the swizzling of the colors can be noticeably slower (regardless of what choice we take there will be different cases where we are hit with this).

It is probably not a deal-breaker either way, though - for my library not having ARGB will mean I won’t point users towards what is guaranteed by the standard to exist, but in practice I think RGBA is implemented by most relevant places anyhow.

(*) Also known as ARGB32, ARGB8888, or ARGB8. The ARGB name comes from the format being defined as channels packed into an Int32, so it describes positions inside the integer rather than positions in memory, and the byte order in memory is reversed on little-endian x86.

Endianness?

  • Assume little-endian or make explicit?

There is also the choice to make it memory-defined, like typically for RGBA8, and thus having it not matter.

I would start with a shard to be able to iterate faster. Then we can vendor or merge it if appropriate.

Given your participation in the community I would trust any library of yours. If you rather have it in a crystal named org, we could use crystal-community for a start. I would then be in favor of even moving it to crystal org without affecting maintainers.

On the design, I think I like the mixin for interoperability and specific structures for memory layout.

Thanks for originally raising this issue - you identified the fragmentation problem that sparked this discussion.

Oh, you are taking the idea further. Nice. I was thinking more as a shard but this makes sense too.

Yeah, I went with stdlib in the RFC since that’s what Go does, but I’m not attached to it. A foundation shard might actually be more practical for iteration. Open to whatever the community thinks makes sense.

There is definitely an argument for BGRA8 (*) as well (probably in addition to RGBA8), due to how common it is (originally due to being the native format in VGA). There are definitely performance questions here, as images can be big enough that the swizzling of the colors can be noticeably slower

That’s a solid point about BGRA8. The performance cost of swizzling on large images is real, and it’s native to a lot of graphics APIs. Worth considering whether BGRA8/BGRA16 should be in the foundation alongside RGBA variants. What do others think?
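
For anyone unfamiliar with the term, swizzling here means reordering the channel bytes of every pixel. A minimal sketch, assuming a tightly packed 8-bit, 4-channel buffer (`swizzle_rb!` is a hypothetical helper, not part of the RFC):

```crystal
# Swap the R and B bytes of every 4-byte pixel in place (RGBA <-> BGRA).
# Hypothetical helper; assumes a tightly packed 8-bit, 4-channel buffer.
def swizzle_rb!(pixels : Bytes) : Nil
  raise ArgumentError.new("size must be a multiple of 4") unless pixels.size % 4 == 0

  (0...pixels.size).step(4) do |i|
    pixels[i], pixels[i + 2] = pixels[i + 2], pixels[i]
  end
end

# Two RGBA pixels: opaque red, then (10, 20, 30, 40).
rgba = Bytes[255, 0, 0, 255, 10, 20, 30, 40]
swizzle_rb!(rgba)
p rgba # => Bytes[0, 0, 255, 255, 30, 20, 10, 40]
```

The loop touches every pixel, so the cost grows linearly with image size, which is why a native BGRA type could matter for large images.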

There is also the choice to make it memory-defined, like typically for RGBA8, and thus having it not matter.

Good clarification - for 8-bit types, endianness isn’t an issue since each component is 1 byte. The memory layout is the format. For 16-bit types, we’d need to be explicit (I mentioned little-endian in the RFC, but maybe that needs more discussion).
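
To make the 16-bit case concrete, here is a small sketch using the standard library's IO::ByteFormat to decode the same two bytes both ways (the sample bytes are made up for illustration):

```crystal
# Two bytes holding one 16-bit gray sample.
raw = Bytes[0x34, 0x12]

little = IO::ByteFormat::LittleEndian.decode(UInt16, raw)
big = IO::ByteFormat::BigEndian.decode(UInt16, raw)

puts little.to_s(16) # => 1234
puts big.to_s(16)    # => 3412
```

The same bytes yield two different channel values, so a 16-bit format has to state its byte order (or be defined per channel in native order) to be unambiguous.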

Thanks for the guidance on the approach!

I would start with a shard to be able to iterate faster. Then we can vendor or merge it if appropriate.

Makes sense - shard-first approach gives more flexibility to iterate on the design.

Given your participation in the community I would trust any library of yours. If you rather have it in a crystal named org, we could use crystal-community for a start. I would then be in favor of even moving it to crystal org without affecting maintainers.

I appreciate the trust! I’m open to helping get this started, though I want to make sure there’s broader community buy-in on the approach first. If others think this direction makes sense, crystal-community seems like the right home for it.

On the design, I think I like the mixin for interoperability and specific structures for memory layout.

Glad the approach resonates! Just to confirm - you’re thinking modules/protocols for abstract interfaces (like Buffer) plus concrete structs for specific memory layouts (like RGBA8, BGRA8)? Want to make sure I’m understanding correctly.

Thanks for the pointer! Took a look at red-colors. Interesting that they include HUSL, xyY, and XYZ colorspaces alongside the usual RGB/HSL.

HUSL (Human-friendly HSL) is perceptually uniform - useful for generating color palettes where steps look evenly spaced to humans. XYZ and xyY are CIE colorspaces - device-independent representations that are foundational for color science.

The RFC currently focuses on RGB-based types (RGBA8, etc.) for actual pixel storage. Advanced colorspaces like LAB/XYZ/HUSL are important for color science work, but I’m not sure they belong in the minimal foundation - they’re more specialized use cases. What do you think?

I think image types should be provided as standalone shards, similar to crystal-db, even if they are official.

Agreed - that aligns with what the RFC already proposes as one option. Keeps it flexible while still being a recognized standard.

Awesome. When ready name the repo and I or others can create it.

Correct. I think it offers flexibility. As usual with unions we might need to see if the dispatch affects the performance or not. If they are present in arguments they should disappear but if the modules are used in ivars it might get translated to a big union.

Awesome. When ready name the repo and I or others can create it.

Thanks! Let’s wait for a bit more feedback from the community on the design (especially around BGRA8 and other format questions). Once there’s consensus on the approach, we can figure out a good name together.

Correct. I think it offers flexibility. As usual with unions we might need to see if the dispatch affects the performance or not. If they are present in arguments they should disappear but if the modules are used in ivars it might get translated to a big union.

That’s a good point about the union performance implications. For the foundation types, I’m expecting most usage would be concrete structs in ivars (like @pixels : Slice(RGBA8)), which should avoid the union dispatch issue. The module interfaces would mainly be for library interop where flexibility matters more. But you’d know better than me about the compiler behavior here - appreciate the heads up on that consideration.

That makes sense. If there ends up being a need for it, we could either expand or add a separate shard for more obscure variants.

And avoiding unions where possible seems more or less like a requirement to me - you really don’t want buffers of unions, and the field names will differ so anything involving unions will be a pain. There might be a unified API around serialization and other conversions or so, but not much else.
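
One way to keep buffers free of unions is to make the container generic over a single concrete pixel struct. This is only a sketch under that assumption: `Canvas(T)` is a hypothetical class, and the two records are trimmed-down stand-ins for the RFC structs.

```crystal
# Trimmed-down stand-ins for the RFC structs.
record RGBA8, r : UInt8, g : UInt8, b : UInt8, a : UInt8
record Gray8, y : UInt8

# Hypothetical canvas, generic over one concrete pixel struct.
# Each instantiation stores a Slice of exactly one type, never a union.
class Canvas(T)
  getter width : Int32
  getter height : Int32
  @pixels : Slice(T)

  def initialize(@width, @height, fill : T)
    @pixels = Slice(T).new(@width * @height, fill)
  end

  def [](x : Int32, y : Int32) : T
    @pixels[y * @width + x]
  end

  def []=(x : Int32, y : Int32, value : T)
    @pixels[y * @width + x] = value
  end
end

rgba = Canvas(RGBA8).new(2, 2, RGBA8.new(0_u8, 0_u8, 0_u8, 255_u8))
gray = Canvas(Gray8).new(2, 2, Gray8.new(0_u8))

rgba[1, 0] = RGBA8.new(255_u8, 0_u8, 0_u8, 255_u8)
puts rgba[1, 0].r       # => 255
puts typeof(gray[0, 0]) # => Gray8
```

The field names still differ between pixel types (`r`/`g`/`b`/`a` versus `y`), so shared code would be limited to things like indexing and dimensions, in line with the point above.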

IMO the shard should contain lots of types.

It’s the whole point of the shard: experiment with the ideas, propose useful types for broad use cases, refine them together, and eventually only import the very general ones into stdlib… or just keep the shard as a foundation for the community.

I’m sadly not working with Crystal day-to-day at the moment and couldn’t find an answer in the language reference, so: can we reopen an enum to add to it? If not, how would shards add color spaces?

(This question brought to you by me giving up on writing my own conversions to and from Oklab for an OpenCV project…)

No, not possible:

# app.cr

enum Foo
  One
  Two
end

enum Foo
  Three
end

$ crystal run app.cr
Showing last frame. Use --error-trace for full trace.

In app.cr:7:1

 7 | enum Foo
     ^
Error: can't reopen enum and add more constants to it

Perhaps the same way crystal-db registers different drivers: by building a Hash that maps the URI scheme used for them (mysql://, sqlite3://, etc.) to the actual implementation (Mysql::Database, etc.).
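
A registry along those lines might look roughly like the sketch below. Every name in it (`ColorSpaceInfo`, `ColorSpaces`, `register`, `fetch`) is hypothetical and only illustrates the idea; it is not how crystal-db itself is implemented.

```crystal
module Image
  # Hypothetical descriptor for a color space.
  record ColorSpaceInfo, name : String, components : Array(String)

  # Hypothetical open registry: shards add entries at load time instead of
  # reopening an enum.
  module ColorSpaces
    @@registry = {} of String => ColorSpaceInfo

    def self.register(key : String, info : ColorSpaceInfo) : Nil
      @@registry[key] = info
    end

    def self.fetch(key : String) : ColorSpaceInfo
      @@registry[key]? || raise ArgumentError.new("unknown color space: #{key}")
    end
  end
end

# The foundation registers the common case...
Image::ColorSpaces.register("srgb", Image::ColorSpaceInfo.new("sRGB", ["r", "g", "b"]))
# ...and a third-party shard registers its own space under a new key.
Image::ColorSpaces.register("oklab", Image::ColorSpaceInfo.new("Oklab", ["L", "a", "b"]))

puts Image::ColorSpaces.fetch("oklab").name # => Oklab
```

The trade-off is that lookups happen at run time, so a mistyped key raises when the code runs instead of failing at compile time as an enum member would.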

Cheers.

Thanks for the clarification! I suppose in retrospect I should have just tried it myself. :sweat_smile:

My concern is that the proposed ColorSpace and PixelFormat enums have a small subset of the possible options out of each category. Since enums can’t be reopened, library authors would need to make their own representations. For example,

module ColorThingy

  def doSomething(
    buffer : ImageBufferImpl,
    originalColorSpace : (::Image::ColorSpace | ColorThingy::ColorSpace),
    newColorSpace : (::Image::ColorSpace | ColorThingy::ColorSpace)
  )
    # do something
  end

end

It just seems awkward. I’m definitely open to an alternative workaround that I’m not seeing (e.g. if any of the referenced libraries in other languages uses enums this way and has an established approach to extension), but I wanted to bring up the potential issue before it cropped up in practice.

With classes?

abstract class ColorSpace
end

module Image
  class RGB < ::ColorSpace
  end
end

module ColorThingy
  class CMYK < ::ColorSpace
  end
end

def doSomething(cs1 : ColorSpace.class, cs2 : ColorSpace.class)
  p! cs1, cs2
end

doSomething Image::RGB, ColorThingy::CMYK

That certainly seems like a valid alternate approach (though the .class is a bit cumbersome, and I’m sure there’s a way around that). I’m not saying there’s no solution, just that I don’t think enum is it.

Here’s another extensible approach (with its own issues):

module Image
  class ColorSpace
    getter name : String
    getter components : Array(String)
    
    def initialize(@name : String, @components : Array(String))
    end
  end
end

module Image
  RGB = Image::ColorSpace.new("RGB", ["red", "green", "blue"])
end

module ColorThingy
  OKLAB = Image::ColorSpace.new("Oklab", ["lightness", "a", "b"])
end

def doSomething(cs1 : Image::ColorSpace, cs2 : Image::ColorSpace)
  p! cs1, cs2
end

doSomething Image::RGB, ColorThingy::OKLAB

I believe this “leaks” a bit because ColorSpace#components returns the mutable internal array by reference. Also, I find it unsatisfying to use constant instances of a normal type for this purpose. The compile-time function resolution you get from the ColorSpace.class approach is definitely ideal.
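
The leak could be plugged with defensive copies, for example as in this sketch (a variation on the class above, not something proposed elsewhere in the thread):

```crystal
module Image
  class ColorSpace
    getter name : String
    @components : Array(String)

    def initialize(@name : String, components : Array(String))
      @components = components.dup # defensive copy of the caller's array
    end

    # Hand out a copy so callers cannot mutate the internal array.
    def components : Array(String)
      @components.dup
    end
  end
end

RGB = Image::ColorSpace.new("RGB", ["red", "green", "blue"])
RGB.components << "alpha" # mutates only the returned copy
puts RGB.components.size  # => 3
```

This costs an allocation per call, so it only addresses the mutability concern, not the unease about using constant instances for this purpose.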

I looked at how Go, Rust, and Python handle ColorSpace in their image foundations.

Finding: None of them include it.

Go’s image/color - just provides RGBA, NRGBA, Gray types. No ColorSpace. Implicitly assumes sRGB.

Rust’s image crate - provides pixel format types. Color space conversion is in separate crates (palette, kolor).

Python’s PIL/Pillow - string modes like “RGB”, “RGBA”. No ColorSpace type. Assumes sRGB. ICC profiles are opt-in.

Why they skip it:

  • Most images are sRGB anyway
  • Keeps foundation simple and fast
  • Users who need color management use specialized libraries

For Crystal:
Skip ColorSpace in the foundation. Just provide pixel types (RGBA8, NRGBA8, etc.) and document that RGB means sRGB. Users needing color space conversion can use a separate shard.

This avoids the sealed enum problem and keeps the foundation minimal like the RFC intended.

I’m thinking of crystal-image as the repo name, but I’m open to suggestions/recommendations.

Just a comment - “sealed enum problem” is actually moot, since it ain’t a viable solution to use anyway.

The crystal-community/crystal-image repository has been created on GitHub, and I’ve granted you access to it. We can always iterate if needed.
