The Case of the Hidden Mongo Data

Over here at Vena, we share a docker-compose.yml file to help new developers set
up their environments smoothly. I’ve edited this file to mount the Mongo and
MySQL data to an accessible location on the developer’s local machine:

mongo:
  image: mongo
  volumes:
    - ../../../data/mongo:/data/db
    - ../../../backups:/backups
  ports:
    - "127.0.0.1:27017:27017"
mysql:
  image: mysql:5.6.33
  volumes:
    - ../../../data/mysql:/var/lib/mysql

Over the last few months I’ve been hearing about Mongo containers losing their
restored data, forcing developers to restore all of their Mongo data again. A
restore can take hours.

The question is: why is the data disappearing? This does not happen with MySQL.
I have mounted the Docker volume to my local disk, and I expect the Mongo data
to be stored in that local disk location.

MongoDB stores the data on disk as BSON (https://www.mongodb.com/json-and-bson)
in your data path directory. Since I did not supply a config file for my Mongo
Docker container, MongoDB should store these files in the default location,
/data/db.

Let’s check this out. If I enter the local disk folder I mounted in the
container (data/mongo/db), I see that it is empty. None of the data is stored in
this folder, contrary to what I expected.

Let’s check inside the container itself. I enter it, run bash, go to the
*/data/db* folder and find tons of files, including a storage.bson file. (Note:
WiredTiger is the default storage engine for Mongo:
https://docs.mongodb.com/manual/core/wiredtiger/)

So it looks like I suspected: this folder is not being mounted to my local
drive. If I inspect the running container, I find that the */data/db* directory
in my container is actually mounted somewhere else:

"Type": "volume",
"Name": "fb529a5a01731942c1535d2ae010b22ab4c35e24b6f1af253cf25944a2e2ac65",
"Source": "/var/lib/docker/volumes/fb529a5a01731942c1535d2ae010b22ab4c35e24b6f1af253cf25944a2e2ac65/_data",
"Destination": "/data/db",
"Driver": "local",
"Mode": "rw",
"RW": true,
"Propagation": ""

This path:

/var/lib/docker/volumes/fb529a5a01731942c1535d2ae010b22ab4c35e24b6f1af253cf25944a2e2ac65/_data

is inside my virtual hard disk, which on my machine lives in
C:\Users\Public\Documents\Hyper-V\Virtual hard disks.
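To repeat this check without reading the whole inspect output by eye, here is a minimal Python sketch that filters the "Mounts" section for Docker-managed volumes. The function name `volume_mounts` and the `/host/backups` bind entry in the sample are made up for illustration; the field names are the ones shown in the output above.

```python
import json


def volume_mounts(inspect_json):
    """Return every mount of type "volume" found in `docker inspect` JSON output."""
    containers = json.loads(inspect_json)  # docker inspect prints a JSON array
    found = []
    for container in containers:
        for mount in container.get("Mounts", []):
            if mount.get("Type") == "volume":
                found.append(
                    {
                        "destination": mount.get("Destination"),
                        "name": mount.get("Name"),
                        "source": mount.get("Source"),
                    }
                )
    return found


VOLUME_ID = "fb529a5a01731942c1535d2ae010b22ab4c35e24b6f1af253cf25944a2e2ac65"

# Trimmed sample input. The volume entry is copied from the inspect output;
# the bind entry is a hypothetical example of a directory that did mount.
sample = json.dumps([{"Mounts": [
    {
        "Type": "volume",
        "Name": VOLUME_ID,
        "Source": "/var/lib/docker/volumes/" + VOLUME_ID + "/_data",
        "Destination": "/data/db",
        "Driver": "local",
        "Mode": "rw",
        "RW": True,
        "Propagation": "",
    },
    {
        "Type": "bind",
        "Source": "/host/backups",
        "Destination": "/backups",
        "Mode": "rw",
        "RW": True,
        "Propagation": "rprivate",
    },
]}])

for mount in volume_mounts(sample):
    print(mount["destination"], "->", mount["source"])
```

Against a real container, the input would be the text printed by `docker inspect <container id>`. Any destination that shows up in this list is being kept in a Docker-managed volume rather than in a folder you mounted yourself.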

Docker Lesson:
Just because you mount a directory doesn’t mean its sub-directories are
necessarily mounted

This is counterintuitive to me. Usually, sub-directories are mounted. If I
create a folder in the mounted local path, I see it right away in my Docker
container.

So what’s special about the data/db directory and why isn’t it mounted to
where I expect?

Let’s do an experiment by directly mounting my local Mongo directory to the
data/db folder in the container, i.e. changing my yml file to include:

 - ../../../data/mongo/db:/data/db

I remove my existing container and start up a new one. It doesn’t work. Let’s
check out the logs of this container:

C:\Users\CTabis\Vena\devops\docker\full-stack>docker container logs e890c9fe6ff1
2018-10-12T19:34:37.243+0000 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=e890c9fe6ff1
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] db version v4.0.1
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] git version: 54f1582fc6eb01de4d4c42f26fc133e623f065fb
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.2g  1 Mar 2016
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] modules: none
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] build environment:
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten]     distmod: ubuntu1604
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten]     distarch: x86_64
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] options: { net: { bindIpAll: true } }
2018-10-12T19:34:37.272+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=478M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2018-10-12T19:34:37.904+0000 E STORAGE  [initandlisten] WiredTiger error (1) [1539372877:904805][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted Raw: [1539372877:904805][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted
2018-10-12T19:34:37.920+0000 E STORAGE  [initandlisten] WiredTiger error (17) [1539372877:920575][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: File exists Raw: [1539372877:920575][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: File exists
2018-10-12T19:34:37.924+0000 I STORAGE  [initandlisten] WiredTiger message unexpected file WiredTiger.wt found, renamed to WiredTiger.wt.1
2018-10-12T19:34:37.926+0000 E STORAGE  [initandlisten] WiredTiger error (1) [1539372877:926675][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted Raw: [1539372877:926675][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted
2018-10-12T19:34:37.939+0000 E STORAGE  [initandlisten] WiredTiger error (17) [1539372877:939310][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: File exists Raw: [1539372877:939310][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: File exists
2018-10-12T19:34:37.944+0000 I STORAGE  [initandlisten] WiredTiger message unexpected file WiredTiger.wt found, renamed to WiredTiger.wt.2
2018-10-12T19:34:37.946+0000 E STORAGE  [initandlisten] WiredTiger error (1) [1539372877:946721][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted Raw: [1539372877:946721][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted
2018-10-12T19:34:37.948+0000 F STORAGE  [initandlisten] Failed to start up WiredTiger under any compatibility version.
2018-10-12T19:34:37.949+0000 F STORAGE  [initandlisten] 1: Operation not permitted
2018-10-12T19:34:37.949+0000 F -        [initandlisten] Fatal Assertion 28595 at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp 194
2018-10-12T19:34:37.949+0000 F -        [initandlisten]

***aborting after fassert() failure

After much googling I find:

*IMPORTANT* MongoDB requires a filesystem that supports fsync() on directories. For example, HGFS and Virtual Box’s shared folders do not support this operation.

Therefore, the directory where MongoDB stores its data files cannot be mounted
directly to any of my local directories. This means that each time a Docker
container is recreated, the data is stored somewhere else and I lose access to
what was there before.

Great. So one solution is to never lose your Mongo Docker container. This is not
ideal or guaranteed. We should be able to delete and spin up multiple containers
using the same image and expect them to persist data.

How else can I access the same persisted Mongo data across multiple Docker
container instances?

Another solution is to create an external volume so we always have a reference
to the same place in the Hyper-V virtual hard disk. Create it from your command
line:

docker volume create mongo_external
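If the volume creation is part of a setup script, it can help to know whether the volume was already there, since a freshly created volume is empty and needs a restore. Below is a rough Python sketch of that check. It only wraps the `docker volume inspect` and `docker volume create` commands; the helper name `ensure_volume` and its `run` parameter are assumptions made for this example, not part of any Docker tooling.

```python
import subprocess


def ensure_volume(name, run=subprocess.run):
    """Create the named Docker volume unless it already exists.

    Returns True when the volume had to be created (so it is empty),
    and False when it was already present.
    """
    # `docker volume inspect` exits with a non-zero code for an unknown volume.
    inspect = run(
        ["docker", "volume", "inspect", name],
        capture_output=True,
        text=True,
    )
    if inspect.returncode == 0:
        return False

    # The volume is missing: create it, and fail loudly if Docker refuses.
    run(
        ["docker", "volume", "create", name],
        capture_output=True,
        text=True,
        check=True,
    )
    return True


# Example call (needs a running Docker daemon, so it is left commented out):
# if ensure_volume("mongo_external"):
#     print("New empty volume created, restore your Mongo data once.")
```

The `run` parameter exists only so the function can be exercised without Docker by passing in a stand-in for `subprocess.run`.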

Then, in your docker-compose.yml, create a reference to that external volume in
a separate top-level section. For reference on using a top-level volumes key:

https://docs.docker.com/compose/compose-file/#volume-configuration-reference

volumes:
   mongoData:
     external:
       name: mongo_external

Then, also in the .yml file, reference the top level volume key as your mounted
volume, making sure you map it to the Mongo data folder data/db:

mongo:
  image: mongo
  volumes:
    - mongoData:/data/db
    - ../../../backups:/backups

Now each time a Mongo container is created, it references the same place.
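The difference between the two styles of volume entry is easy to miss in a shared yml file, so here is a small Python sketch that tells them apart. It assumes the short "source:target" syntax used in this post and treats a source starting with ".", "/" or "~" as a local path and anything else as a named volume. Windows drive-letter paths such as C:\data are not handled, and both function names are made up for this example.

```python
def classify_mount(entry):
    """Label a Compose short-syntax volume entry ("source:target[:mode]")."""
    parts = entry.split(":")
    if len(parts) < 2:
        # Only a container path was given, so Docker creates an anonymous volume.
        return {"kind": "anonymous", "source": None, "target": parts[0]}
    source, target = parts[0], parts[1]
    # Assumption: local paths start with ".", "/" or "~"; other sources are names.
    kind = "bind" if source.startswith((".", "/", "~")) else "named"
    return {"kind": kind, "source": source, "target": target}


def uses_named_volume(volumes, target="/data/db"):
    """Return True when `target` is backed by a named volume."""
    return any(
        mount["kind"] == "named" and mount["target"] == target
        for mount in map(classify_mount, volumes)
    )


old_entries = ["../../../data/mongo/db:/data/db", "../../../backups:/backups"]
new_entries = ["mongoData:/data/db", "../../../backups:/backups"]

print(uses_named_volume(old_entries))
print(uses_named_volume(new_entries))
```

A check like this could run over the volumes list of the mongo service to confirm nobody has switched /data/db back to a local folder.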

This does not actually mount to anywhere on your local disk. You still can’t see
your files on the local disk, but now your data will not be lost each time you
recreate a container :)