GitHub - eddelbuettel/ulid: Universally Unique Lexicographically Sortable Identifiers in R

ulid -- Universally Unique Lexicographically Sortable Identifiers

Description

As described in the ulid specification repo, and slightly edited here, UUID use can be suboptimal for many uses-cases because:

It is not the most character efficient way of encoding 128 bits of randomness
UUID v1/v2 is impractical in many environments, as it requires access to a unique, stable MAC address
UUID v3/v5 requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures
UUID v4 provides no other information than randomness which can cause fragmentation in many data structures

Instead, an alternative is proposed in ULID:

ulid() // 01ARZ3NDEKTSV4RRFFQ69G5FAV

with the following properties:

128-bit compatibility with UUID
1.21e+24 unique ULIDs per millisecond
Lexicographically sortable!
Canonically encoded as a 26 character string, as opposed to the 36 character UUID
Uses base32 by Crockford for better efficiency and readability (5 bits per character)
Case insensitive
No special characters (URL safe)
Monotonic sort order (correctly detects and handles the same millisecond)

 01AN4Z07BY      79KA1307SR9X4MV3

|----------|    |----------------|
 Timestamp          Randomness
   48bits             80bits

Components

Timestamp

48 bit integer
UNIX-time in milliseconds
Will not run out of space until the year 10889 AD.

Randomness

80 bits
Cryptographically secure source of randomness, if possible

Sorting

The left-most character must be sorted first, and the right-most character sorted last (lexical order). The default ASCII character set must be used. Within the same millisecond, sort order is not guaranteed.

What Is Supported:

The following functions are implemented:

ts_generate: Generate ULIDs from timestamps
ULIDgenerate: Generate ULIDs
unmarshal: Unmarshal a ULID into a data frame with timestamp and random bitstring columns
ulid: Alias for ULIDgenerate

Installation

The package can be installed from CRAN via install.packages("ulid"). Development versions can also be installed from this repository or from r-universe via

r <- c('https://eddelbuettel.r-universe.dev', 'https://cloud.r-project.org')
install.packages('ulid', repos = r)

Usage

ulid::ULIDgenerate()
## [1] "0001EKRGEEV98QP062VNRX31P2"

(u <- ulid::ULIDgenerate(20))
##  [1] "0001EKRGEEV5XMP54RRRWAK318" "0001EKRGEEKX7VC0PF75AZJXHP"
##  [3] "0001EKRGEEXENNCQEH4KCH8QAD" "0001EKRGEEY41HJ6GMXRV1BQBA"
##  [5] "0001EKRGEE6HVD7ACWZ52MTVCJ" "0001EKRGEEQWXMPXGC0DGQN32B"
##  [7] "0001EKRGEE6W13BK92EF1RXYT7" "0001EKRGEE5A31H38NJFGTK8PC"
##  [9] "0001EKRGEEG2GXS53QY9F3M0A9" "0001EKRGEEDA3Y6Y0T52WTS6RM"
## [11] "0001EKRGEE5WS2S3D9KY3F5H9Y" "0001EKRGEE24SZW5NATAADAY9Q"
## [13] "0001EKRGEEBEG51QCKXPM8ZS16" "0001EKRGEE1ZC1QY7RCJR9VJ0B"
## [15] "0001EKRGEECJ50Z4FXM4HW6XWG" "0001EKRGEEER84JP8WTXV5DWV8"
## [17] "0001EKRGEEW3ABA82GZSRXN1RB" "0001EKRGEEAA60CYFGR8832JD6"
## [19] "0001EKRGEE6W5ARCFHH6T75FPZ" "0001EKRGEE5WT4XNP7NS69BM3X"

unmarshal(u)
##                     ts              rnd
## 1  2019-07-27 08:21:34 V5XMP54RRRWAK318
## 2  2019-07-27 08:21:34 KX7VC0PF75AZJXHP
## 3  2019-07-27 08:21:34 XENNCQEH4KCH8QAD
## 4  2019-07-27 08:21:34 Y41HJ6GMXRV1BQBA
## 5  2019-07-27 08:21:34 6HVD7ACWZ52MTVCJ
## 6  2019-07-27 08:21:34 QWXMPXGC0DGQN32B
## 7  2019-07-27 08:21:34 6W13BK92EF1RXYT7
## 8  2019-07-27 08:21:34 5A31H38NJFGTK8PC
## 9  2019-07-27 08:21:34 G2GXS53QY9F3M0A9
## 10 2019-07-27 08:21:34 DA3Y6Y0T52WTS6RM
## 11 2019-07-27 08:21:34 5WS2S3D9KY3F5H9Y
## 12 2019-07-27 08:21:34 24SZW5NATAADAY9Q
## 13 2019-07-27 08:21:34 BEG51QCKXPM8ZS16
## 14 2019-07-27 08:21:34 1ZC1QY7RCJR9VJ0B
## 15 2019-07-27 08:21:34 CJ50Z4FXM4HW6XWG
## 16 2019-07-27 08:21:34 ER84JP8WTXV5DWV8
## 17 2019-07-27 08:21:34 W3ABA82GZSRXN1RB
## 18 2019-07-27 08:21:34 AA60CYFGR8832JD6
## 19 2019-07-27 08:21:34 6W5ARCFHH6T75FPZ
## 20 2019-07-27 08:21:34 5WT4XNP7NS69BM3X

(ut <- ts_generate(as.POSIXct("2017-11-01 15:00:00", origin="1970-01-01")))
## [1] "0001CZM6DG2THKSAX3F1SF30E7"

unmarshal(ut)
##                    ts              rnd
## 1 2017-11-01 15:00:00 2THKSAX3F1SF30E7

Millisecond Resolution

As per issue #13 on the upstream repo, time is actually encoded mostly as time_t leading to second rather than millisecond resolution. Two patches by Chris Brove also collected in his fork improve on this by using std::chrono objects internally. In release 0.4.0, we have switches to his fork and extended the wrapper functions to support this:

> library(ulid)
> gen_ulid <- \(sleep) replicate(5, {Sys.sleep(sleep); generate()})
> u <- gen_ulid(.1)
> df <- unmarshal(u)
> data.table::data.table(df)
                        ts              rnd
                    <POSc>           <char>
1: 2024-05-30 16:38:28.588 CSQAJBPNX75R0G5A
2: 2024-05-30 16:38:28.688 XZX0TREDHD6PC1YR
3: 2024-05-30 16:38:28.789 0YK9GKZVTED27QMK
4: 2024-05-30 16:38:28.890 SC3M3G6KGPH7S50S
5: 2024-05-30 16:38:28.990 TSKCBWJ3TEKCPBY0
>

Author

Suyash Verma wrote the C++ header library ulid.

Chris Bove updated internals to permit sub-second resolution in his fork.

Bob Rudis created the R package, prepared versions 0.1.0 and 0.2.0, and released version 0.3.0 to CRAN.

Dirk Eddelbuettel has been maintainer since release 0.3.1.

License

The package is licensed under the MIT License

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
.github/workflows		.github/workflows
R		R
inst		inst
local		local
man		man
src		src
tests		tests
.Rbuildignore		.Rbuildignore
.editorconfig		.editorconfig
.gitignore		.gitignore
ChangeLog		ChangeLog
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
NAMESPACE		NAMESPACE
README.md		README.md
cleanup		cleanup

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ulid -- Universally Unique Lexicographically Sortable Identifiers

Description

Components

Sorting

What Is Supported:

Installation

Usage

Millisecond Resolution

Author

License

About

Contributors 2

Languages

License

eddelbuettel/ulid

Folders and files

Latest commit

History

Repository files navigation

ulid -- Universally Unique Lexicographically Sortable Identifiers

Description

Components

Sorting

What Is Supported:

Installation

Usage

Millisecond Resolution

Author

License

About

Topics

Resources

License

Stars

Watchers

Forks

Contributors 2

Languages