ImageNet in Dolt
ImageNet is a dataset maintained by the Stanford Vision Lab. It seems to have fallen into disrepair: the links to download the image labels are broken. We have managed to procure all four released versions of the labeled images and import them into Dolt. We've also included a copy of WordNet in the ImageNet repository to make querying easier.
For a quick primer on WordNet in Dolt, see our earlier blog entry.
ImageNet as we imported it into Dolt is simply an additional table, images_synsets, keyed by synset_type, synset_id, and image_id. There is an additional column with a URL to the image. You can get access to the images either through the URLs or other means. This table allows you to query for images labeled by a synset. WordNet has relationships between synsets, so getting all the images of animals would involve walking the synset graph using the synset_pointers table to all hyponym nodes below animal. This word hierarchy makes gathering labeled image sets very interesting.
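To make the idea of walking the graph concrete, here is a minimal sketch in Python. It assumes you have already pulled (from_synset_id, to_synset_id, pointer_type_symbol) rows out of synset_pointers. The function name `all_hyponyms` and the ids in the sample rows are made up for illustration; only the `~` (hyponym) and `@` (hypernym) pointer symbols come from the data itself.

```python
from collections import deque

HYPONYM = "~"  # pointer_type_symbol used for hyponym pointers

def all_hyponyms(pointer_rows, root_id):
    """Return every synset id reachable from root_id via hyponym pointers."""
    # Index the hyponym edges by their source synset.
    children = {}
    for from_id, to_id, symbol in pointer_rows:
        if symbol == HYPONYM:
            children.setdefault(from_id, []).append(to_id)

    # Breadth-first walk; `seen` guards against visiting a node twice.
    seen = set()
    queue = deque([root_id])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen and child != root_id:
                seen.add(child)
                queue.append(child)
    return seen

# Hypothetical rows: placeholder ids, not real WordNet synsets.
rows = [
    ("00000001", "00000002", "~"),
    ("00000001", "00000003", "~"),
    ("00000002", "00000004", "~"),
    ("00000004", "00000002", "@"),  # hypernym pointer, ignored by the walk
]
descendants = all_hyponyms(rows, "00000001")
print(sorted(descendants))
```

The resulting set of ids is what you would then look up in images_synsets to collect every image under a concept.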
First, in order to get the ImageNet dataset onto your box, install dolt, run dolt login, and then run dolt clone dolthub/image-net. To run SQL, cd image-net and run dolt sql -q "<query>". With Dolt, you can run SQL after one command! No CSVs. No custom import logic.
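If you would rather drive those queries from a script, one possible approach is to shell out to dolt and parse the table it prints. The sketch below is an assumption-laden convenience, not part of Dolt: `run_query` and `parse_table` are hypothetical helper names, the parser assumes no cell contains a `|` character, and `run_query` assumes dolt is on your PATH and the clone lives in ./image-net.

```python
import subprocess

def parse_table(text):
    """Turn dolt's tabular output into a list of dicts, one per row."""
    # Keep only header and data lines; border lines start with "+".
    lines = [ln.strip() for ln in text.splitlines() if ln.strip().startswith("|")]
    if not lines:
        return []

    def cells(line):
        # Drop the outer pipes, then split on the inner ones.
        # Assumption: no cell value contains a "|" character.
        return [cell.strip() for cell in line[1:-1].split("|")]

    header = cells(lines[0])
    return [dict(zip(header, cells(line))) for line in lines[1:]]

def run_query(query, repo_dir="image-net"):
    """Run `dolt sql -q <query>` in repo_dir and return the parsed rows."""
    result = subprocess.run(
        ["dolt", "sql", "-q", query],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_table(result.stdout)

# Sample text in the shape dolt prints, so the parser can be tried without a clone.
sample = """
+-----------+----------+
| synset_id | image_id |
+-----------+----------+
| 07697537  | 30       |
+-----------+----------+
"""
print(parse_table(sample))
```

Every value comes back as a string, so convert counts and ids yourself where it matters.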
In this article, I will use ImageNet in Dolt to explore the classic computer vision problem: hot dog or not hot dog.
image-net$ dolt sql -q "select * from words_synsets where word='hot dog'"
+---------+--------+-----------+-------------+----------+
| word    | lex_id | synset_id | synset_type | word_num |
+---------+--------+-----------+-------------+----------+
| hot dog | 0      | 10187710  | n           | 2        |
| hot dog | 1      | 07676602  | n           | 4        |
| hot dog | 2      | 07697537  | n           | 2        |
+---------+--------+-----------+-------------+----------+
image-net$ dolt sql -q "select * from synsets where synset_type='n' and (synset_id='10187710' or synset_id='07676602' or synset_id='07697537')"
+-----------+-------------+---------+-----------------------------------------------------------------------------------------------+
| synset_id | synset_type | lex_num | gloss                                                                                         |
+-----------+-------------+---------+-----------------------------------------------------------------------------------------------+
| 07676602  | n           | 13      | a smooth-textured sausage of minced beef or pork usually smoked; often served on a bread roll |
| 07697537  | n           | 13      | a frankfurter served hot on a bun                                                             |
| 10187710  | n           | 18      | someone who performs dangerous stunts to attract attention to himself                         |
+-----------+-------------+---------+-----------------------------------------------------------------------------------------------+
image-net$ dolt sql -q "select count(*) from images_synsets where synset_type='n' and (synset_id='07676602' or synset_id='07697537')"
+----------+
| COUNT(*) |
+----------+
| 1257     |
+----------+
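A note on queries shaped like this one, which mix a synset_type filter with several synset_id alternatives: in SQL, AND binds more tightly than OR, so how the conditions are grouped matters. The sketch below shows the difference using Python's built-in SQLite as a stand-in for Dolt. The table is trimmed to three columns, and the row with synset_type 'v' is made up purely to make the difference visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table images_synsets (synset_type text, synset_id text, image_id integer)"
)
conn.executemany(
    "insert into images_synsets values (?, ?, ?)",
    [
        ("n", "07676602", 1),
        ("n", "07697537", 2),
        ("v", "07697537", 3),  # made-up row: same id, different synset_type
    ],
)

# Without parentheses this reads as: (type='n' and id=A) or id=B
loose = conn.execute(
    "select count(*) from images_synsets "
    "where synset_type='n' and synset_id='07676602' or synset_id='07697537'"
).fetchone()[0]

# With parentheses the type filter applies to both ids.
strict = conn.execute(
    "select count(*) from images_synsets "
    "where synset_type='n' and (synset_id='07676602' or synset_id='07697537')"
).fetchone()[0]

print(loose, strict)  # prints: 3 2
```

Since images_synsets is keyed by synset_type as well as synset_id, grouping the id alternatives in parentheses is the safer habit.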
image-net$ dolt sql -q "select * from images_synsets where synset_type='n' and (synset_id='07676602' or synset_id='07697537') limit 5"
+-------------+-----------+----------+---------------------------------------------------------------+
| synset_type | synset_id | image_id | image_url                                                     |
+-------------+-----------+----------+---------------------------------------------------------------+
| n           | 07697537  | 30       | http://www.loafnjug.com/images/hot-dog-and-tea.jpg            |
| n           | 07697537  | 53       | http://farm1.static.flickr.com/91/220588966_8350522b9a.jpg    |
| n           | 07697537  | 88       | http://farm3.static.flickr.com/2200/2252143352_1f628be218.jpg |
| n           | 07697537  | 112      | http://farm2.static.flickr.com/1411/722638089_cd4a75d59a.jpg  |
| n           | 07697537  | 128      | http://farm4.static.flickr.com/3645/3396903223_f8601dcdd7.jpg |
+-------------+-----------+----------+---------------------------------------------------------------+
You can train an image classifier using the 1,257 images of hot dogs and ~13M images of not hot dogs.
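As a rough sketch of the first step toward such a classifier, the snippet below turns images_synsets-style rows into (image_url, label) pairs. `label_rows` is a hypothetical helper, the rows are assumed to be dicts keyed by column name, and the second sample row uses a placeholder id and URL rather than real data.

```python
HOT_DOG_SYNSETS = {"07676602", "07697537"}  # the two food senses of "hot dog"

def label_rows(rows):
    """Map images_synsets-style rows to (image_url, label) pairs.

    label is 1 for hot dog and 0 for not hot dog.
    """
    pairs = []
    for row in rows:
        is_hot_dog = (
            row["synset_type"] == "n" and row["synset_id"] in HOT_DOG_SYNSETS
        )
        pairs.append((row["image_url"], 1 if is_hot_dog else 0))
    return pairs

rows = [
    {"synset_type": "n", "synset_id": "07697537",
     "image_url": "http://www.loafnjug.com/images/hot-dog-and-tea.jpg"},
    # Hypothetical negative example: placeholder id and URL.
    {"synset_type": "n", "synset_id": "00000000",
     "image_url": "http://example.com/not-a-hot-dog.jpg"},
]
labels = label_rows(rows)
print(labels)
```

Downloading the images and training the model are left out on purpose; with roughly 1,257 positives against millions of negatives you would also want to think about class balance.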
Let's say you want to include all types of sausages, not just hot dogs, in your training set. Sausage is a hypernym of hot dog.
image-net$ dolt sql -q "select * from synset_pointers where from_synset_type='n' and from_synset_id='07676602' and pointer_type_symbol='@'"
+----------------+------------------+--------------+----------------+---------------------+------------------+-----------------+---------------+-------------+
| from_synset_id | from_synset_type | to_synset_id | to_synset_type | pointer_type_symbol | semantic_pointer | lexical_pointer | from_word_num | to_word_num |
+----------------+------------------+--------------+----------------+---------------------+------------------+-----------------+---------------+-------------+
| 07676602       | n                | 07675627     | n              | @                   | true             | false           | 0             | 0           |
+----------------+------------------+--------------+----------------+---------------------+------------------+-----------------+---------------+-------------+
image-net$ dolt sql -q "select * from words_synsets where synset_type='n' and synset_id='07675627'"
+---------+--------+-----------+-------------+----------+
| word    | lex_id | synset_id | synset_type | word_num |
+---------+--------+-----------+-------------+----------+
| sausage | 0      | 07675627  | n           | 1        |
+---------+--------+-----------+-------------+----------+
Unfortunately, ImageNet contains no additional images labeled with sausage, but sausage has a bunch of hyponyms.
image-net$ dolt sql -q "select * from synset_pointers where from_synset_type='n' and from_synset_id='07675627' and pointer_type_symbol='~'"
+----------------+------------------+--------------+----------------+---------------------+------------------+-----------------+---------------+-------------+
| from_synset_id | from_synset_type | to_synset_id | to_synset_type | pointer_type_symbol | semantic_pointer | lexical_pointer | from_word_num | to_word_num |
+----------------+------------------+--------------+----------------+---------------------+------------------+-----------------+---------------+-------------+
| 07675627       | n                | 07676121     | n              | ~                   | true             | false           | 0             | 0           |
| 07675627       | n                | 07676273     | n              | ~                   | true             | false           | 0             | 0           |
| 07675627       | n                | 07676425     | n              | ~                   | true             | false           | 0             | 0           |
| 07675627       | n                | 07676520     | n              | ~                   | true             | false           | 0             | 0           |
| 07675627       | n                | 07676602     | n              | ~                   | true             | false           | 0             | 0           |
| 07675627       | n                | 07677071     | n              | ~                   | true             | false           | 0             | 0           |
| 07675627       | n                | 07677255     | n              | ~                   | true             | false           | 0             | 0           |
| 07675627       | n                | 07677360     | n              | ~                   | true             | false           | 0             | 0           |
| 07675627       | n                | 07677480     | n              | ~                   | true             | false           | 0             | 0           |
| 07675627       | n                | 07677593     | n              | ~                   | true             | false           | 0             | 0           |
| 07675627       | n                | 07677747     | n              | ~                   | true             | false           | 0             | 0           |
| 07675627       | n                | 07678313     | n              | ~                   | true             | false           | 0             | 0           |
+----------------+------------------+--------------+----------------+---------------------+------------------+-----------------+---------------+-------------+
Those also contain no additional labeled images.
image-net$ dolt sql -q "select count(*) from images_synsets where (synset_id='07676121' or synset_id='07676273' or synset_id='07676425' or synset_id='07676520' or synset_id='07676602' or synset_id='07677071' or synset_id='07677255' or synset_id='07677360' or synset_id='07677480' or synset_id='07677593' or synset_id='07677747' or synset_id='07678313' or synset_id='07697537')"
+----------+
| COUNT(*) |
+----------+
| 1257     |
+----------+
However, this gives you an idea of how to walk the WordNet graph to expand your labeled set. Maybe this is an opportunity to create a new branch of ImageNet and label some sausage images?
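Typing out a long chain of synset ids by hand gets tedious fast, so here is a small sketch that builds the count query from a list of ids. `count_query` is a hypothetical helper, not anything Dolt provides. It assumes ids are 8-digit strings like the ones shown in this post and rejects anything else, so the values are safe to paste into quoted SQL literals.

```python
import re

def count_query(synset_ids, synset_type="n"):
    """Build a count(*) query over images_synsets for a list of synset ids."""
    if not synset_ids:
        raise ValueError("need at least one synset id")
    if not re.fullmatch(r"[a-z]", synset_type):
        raise ValueError(f"unexpected synset type: {synset_type!r}")
    for synset_id in synset_ids:
        # Assumption: ids are 8-digit strings, as in the tables in this post.
        if not re.fullmatch(r"\d{8}", synset_id):
            raise ValueError(f"unexpected synset id: {synset_id!r}")

    # Parenthesize the id alternatives so the type filter applies to all of them.
    id_clause = " or ".join(f"synset_id='{sid}'" for sid in synset_ids)
    return (
        "select count(*) from images_synsets "
        f"where synset_type='{synset_type}' and ({id_clause})"
    )

print(count_query(["07676602", "07697537"]))
```

The returned string can be passed straight to dolt sql -q.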
As an aside, many of the queries above could be accomplished more efficiently using SQL joins. We're working on join performance right now. Expect joins to get a lot faster and thus more usable on large datasets in the next few months.
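To give a flavor of what a join version could look like, the sketch below goes from a word to its images in one query. It runs against Python's built-in SQLite as a stand-in for Dolt, with the two tables trimmed to the columns the join needs and loaded with a handful of rows copied from the output earlier in this post. Treat it as an illustration of the query shape, not a statement about how Dolt executes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table words_synsets (word text, synset_id text, synset_type text);
create table images_synsets (
    synset_type text, synset_id text, image_id integer, image_url text
);
""")
conn.executemany("insert into words_synsets values (?, ?, ?)", [
    ("hot dog", "10187710", "n"),
    ("hot dog", "07676602", "n"),
    ("hot dog", "07697537", "n"),
])
conn.executemany("insert into images_synsets values (?, ?, ?, ?)", [
    ("n", "07697537", 30, "http://www.loafnjug.com/images/hot-dog-and-tea.jpg"),
    ("n", "07697537", 53, "http://farm1.static.flickr.com/91/220588966_8350522b9a.jpg"),
])

# One query replaces the word lookup followed by the image lookup.
join_rows = conn.execute("""
    select w.word, i.synset_id, i.image_id, i.image_url
    from words_synsets w
    join images_synsets i
      on i.synset_type = w.synset_type and i.synset_id = w.synset_id
    where w.word = 'hot dog'
    order by i.image_id
""").fetchall()

for row in join_rows:
    print(row)
```

Note that joining on the word alone pulls in every sense of "hot dog", so on the full dataset you would still want to filter out the daredevil synset if it has any images.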
Remember, we were able to start exploring the ImageNet data this way with a single clone command. No CSV imports. No worrying about database versions. Get Dolt and start playing with the ImageNet dataset for yourself.