Dylan's BI Study Notes

My notes about Business Intelligence, Data Warehousing, OLAP, and Master Data Management

Use Bit to represent groups

Posted by Dylan Wan on October 11, 2017

Here I am providing an alternate approach of supporting group membership in MySQL.

It is a common seen requirement that a group may have multiple members and a person may be added to multiple groups.  This many to many relationship is typically modeled in an intersection table.

When the group membership is being used as a filter, for example, to show all the students within a group, it becomes a subquery.

Groups

If the number of groups is fixed and not beyond certain number, such as 64, actually, the group can be represented as bit in a bitmap and the check can be done using bit operator.

We first assume that groups are stored in a group table and we assign a bit to each group in a bitmap:

create table dw_groups as
select group_name, c from 
(select "group1" group_name, POW(2,0) c
union
select "group2", POW(2,1)
union
select "group3", POW(2,2)
union
select "group4", POW(2,3)
union
select "group5", POW(2,4)
union
select "group6", POW(2,5)
union
...
union
select "group36", POW(2,35)
union
select "group37", POW(2,36)
union
select "group38", POW(2,37)
union
select "group39", POW(2,38)
union
select "group40", POW(2,39)
) x;

I use the POWER function above.  It is equivalent to using a bit literal.

SELECT CAST(b'0000000000000000000000000000000000000001' AS UNSIGNED);

This is POWER(2,0).  POWER(2,39) is equal to:

SELECT CAST(b'1000000000000000000000000000000000000000' AS UNSIGNED);
-- 549755813888

We can also double check using this:

select bin(549755813888);

 

Student Group Membership

We can still create an intersection table:

create table dw_student_memberships
( student_id bigint(20)
, group_name varchar(7)
);

Add the student 1 to three groups:

insert into dw_student_memberships values (1, "group38");
insert into dw_student_memberships values (1, "group22");
insert into dw_student_memberships values (1, "group17");

 

Students and the flattened list

Assume that the student table also holds the list of groups that a student is a member of.  The group membership can also be represented in a bitmap:

create table dw_students
( student_id bigint(20)
, groups double
);

Here BIT_OR is an aggregate function.  It basically add the assigned group bitmap together to have a single bitmap to represent the memberships.

insert into dw_students (student_id, groups)
select student_id, bit_or(g.c)
from dw_student_memberships sm
inner join dw_groups g
on sm.group_name = g.group_name
group by student_id;

 

Check if a Student is in a group

& is the operator for “Bitwise AND”.

It can be used to get the overlapped members between two list or check if a group is a member of the list.

select s.student_id, g.group_name
from dw_students s
inner join dw_groups g
on s.groups & c;

Since three groups are in the membership column “groups”, the SQL will return three rows based on the intersection table we populated.

 

 

Leave a comment