Insun Lee

命由己造,相由心生,世间万物皆是化相,心不动,万物皆不动,心不变,万物皆不变
WechatinsunLee
Email insunsgmail.com
Country China
LocationHangZhou, ZheJiang

用Golang和PHP生成mongodb的24位的id


mongodb id的用途

  1. 在分布式存储上作为全局唯一id使用;
  2. mogodb转为其他数据库,比如pgsql时,沿用原先的id规则;
  3. 需要隐藏真实数量的地方,比如用mysql存储用户,其自增id很容易看出用户数量;
  4. 防止某种形式的机器攻击,比如根据mysql、pgsql等自增id很容易通过机器生成url,在做采集时很方便。

mongodb的生成规则

mongodb根据版本不同,生成规则有部分调整,不同的地方在中间的10位(5 byte)。根据mongodb的文档,在mongodb3.2以下的ObjectId生成规则:8位的时间戳 + 6位的机器标志 + 4位的进程id + 6位的自增计数

a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier,
a 2-byte process id, and
a 3-byte counter, starting with a random value.

mongodb3.4及以上的版本的ObjectId生成规则:8位的时间戳 + 10位的随机码 + 6位的自增计数

a 4-byte timestamp value, representing the ObjectId’s creation, measured in seconds since the Unix epoch
a 5-byte random value
a 3-byte incrementing counter, initialized to a random value

PHP生成mongodb id

下列算法是mongodb3.2之前的ObjectId生成算法,来自于网上,参考注释部分

/**
     * 生成mongodb3.2级以下版本的mongo ObjectId,用于获得某个数据表的字符串自增id(全局唯一),每个数据表的起始id都是1。id的组成结构参考下列信息:
     *
     * Generate an ID, constructed using:
     *   a 4-byte value representing the seconds since the Unix epoch,
     *   a 3-byte machine identifier,
     *   a 2-byte process id, and
     *   a 3-byte counter, starting with a random value.
     * Just like a MongoId string.
     *
     * @link http://docs.mongodb.org/manual/reference/object-id/
     * @author http://luokr.com/p/16
     * @param int $lastId 上一个id,当不提供(值为null时,系统自动从1开始)
     * @return string 24 hexidecimal characters
     */
    function uniqId($lastId = 0) {
        return sprintf(
            "%08x%06x%04x%06x",
            /* 4-byte value representing the seconds since the Unix epoch. */
            time() & 0xFFFFFFFF,
            /* 3-byte machine identifier.
             *
             * On windows, the max length is 256. Linux doesn't have a limit, but it
             * will fill in the first 256 chars of hostname even if the actual
             * hostname is longer.
             *
             * From the GNU manual:
             * gethostname stores the beginning of the host name in name even if the
             * host name won't entirely fit. For some purposes, a truncated host name
             * is good enough. If it is, you can ignore the error code.
             *
             * crc32 will be better than Times33. */
            crc32(substr((string)gethostname(), 0, 256)) >> 8 & 0xFFFFFF,
            /* 2-byte process id. */
            getmypid() & 0xFFFF,
            /* 3-byte counter, starting with a random value. */
            $lastId = $lastId > 0xFFFFFE ? 1 : $lastId + 1
        );
    }

下列php代码用于生成mongodb3.4级以上版本的ObjectId

/**
     * 生成mongodb3.4及以上版本的mongo ObjectId.
     * 用于获得某个数据表的字符串自增id(全局唯一),每个数据表的起始id都是1。id的组成结构参考下列信息:
     *
     * @author https://insuns.com
     * @param int $lastId
     *
     * @return string
     */
    function uniqId4($lastId = 0) {
        return sprintf(
            "%08x%010x%06x",
            // a 4-byte timestamp value, representing the ObjectId’s creation, measured in seconds since the Unix epoch
            time() & 0xFFFFFFFF,
            //a 5-byte random value
            // !!!这个值应该从存储到缓存,每次从缓存获取!!!
            mt_rand(0, 0xFFFFFFFFFF) ,
            //a 3-byte incrementing counter, initialized to a random value
            $lastId = $lastId > 0xFFFFFE ? 1 : $lastId + 1
        );
    }

golang生成mongodb ObjectId

测试地址:https://play.golang.org/p/zS8-r8kikbm

注意:go playground中生成mongodb3.4以上版本的id每次都一样,经观察应该是随机数生成一致的原因,本人本机测试没问题。
/**
 * 生成全局唯一的24位id
 * File: uuid.go
 * Package: uuid
 * Author: https://insuns.com/blog/YongGOLANGHePHPShengChengMONGODBDe24WeiDeID/5.html
 * Date: 2021-03-03 09:52:04
 */
package uuid
import (
    "fmt"
    "hash/crc32"
    "math/rand"
    "os"
    "time"
)

//GetUniqueId 生成mongodb3.2以下版本的mongo ObjectId.
//用于获得某个数据表的字符串自增id(全局唯一),每个数据表的起始id都是1。id的组成结构参考下列信息:
//
// Generate an ID, constructed using:
//   a 4-byte value representing the seconds since the Unix epoch,
//   a 3-byte machine identifier,
//   a 2-byte process id, and
//   a 3-byte counter, starting with a random value.
// Just like a MongoId string.
func GetUniqueId(lastID ...int64) string {
    var currentId int64 = 1
    if len(lastID) > 0 && lastID[0] < 0xFFFFFF {
        currentId = lastID[0] + 1
    }
    host, err := os.Hostname()
    if err != nil {
        host = "localhost"
    }

    return fmt.Sprintf(
        "%08x%06x%04x%06x",
        time.Now().Unix()&0xFFFFFFFF,
        crc32.ChecksumIEEE([]byte(host))>>8&0xFFFFFF,
        os.Getpid()&0xFFFF,
        currentId)
}

var rd *rand.Rand = rand.New(rand.NewSource(time.Now().UnixNano()))
var midRand int64 = rd.Int63n(0xFFFFFFFF)
//GetUniqueId4 生成mongodb3.4及以上版本的mongo ObjectId.
//用于获得某个数据表的字符串自增id(全局唯一),每个数据表的起始id都是1。id的组成结构参考下列信息:
// a 4-byte timestamp value, representing the ObjectId’s creation, measured in seconds since the Unix epoch
// a 5-byte random value
// a 3-byte incrementing counter, initialized to a random value
func GetUniqueId4(lastID ...int64) string {
    var currentId int64 = 1
    if len(lastID) > 0 && lastID[0] < 0xFFFFFF {
        currentId = lastID[0] + 1
    }
    return fmt.Sprintf(
        "%08x%010x%06x",
        time.Now().Unix()&0xFFFFFFFF,
        midRand,
        currentId)
}

基准测试结果

cpu: Intel(R) Core(TM) i5-6267U CPU @ 2.90GHz
BenchmarkMongoID-4        1000000000             0.0000186 ns/op           0 B/op           0 allocs/op
BenchmarkMongoId4-4       1000000000             0.0000155 ns/op           0 B/op           0 allocs/op

小结

实际上我自己使用时,传入的参数是用到这个id的表的名称,而lastId则通过内存缓存存储,每次要用的时候从缓存获取并加一。

基准测试结果

  • 分享:
评论

    • 博主

    说点什么