The Case of the Hidden Mongo Data

Over here at Vena, we share a docker-compose.yml file to help new developers set up their environments smoothly. I’ve edited this file to mount the Mongo and MySQL data to an accessible location on the developer’s local machine:

mongo:
  image: mongo
  volumes:
    - ../../../data/mongo:/data
    - ../../../backups:/backups
  ports:
    - "127.0.0.1:27017:27017"
mysql:
  image: mysql:5.6.33
  volumes:
    - ../../../data/mysql:/var/lib/mysql

Over the last few months I’ve been hearing about Mongo containers losing their restored data, forcing developers to restore all of their Mongo data again. This can take hours.

The question is why the data is disappearing. This does not happen with MySQL. I have mounted the Docker volume to my local disk, and I expect the Mongo data to be stored in that local disk location.

MongoDB stores the data on the disk as BSON in your data path directory. Since I did not supply a config file for my Mongo Docker container, Mongo should store these files in its default location, /data/db.

Let’s check this out. If I open the local folder that I mounted into the container (data/mongo/db), I see that it is empty. None of the data is stored in this folder as I would expect.

Let’s check inside the container itself. I enter it, run bash, go to the /data/db folder and find tons of files, including a storage.bson file. (Note: WiredTiger is the default storage engine for Mongo: https://docs.mongodb.com/manual/core/wiredtiger/)
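The same check can be done without opening an interactive shell. A minimal sketch, assuming a running container whose ID or name you take from `docker ps`; `list_mongo_data_dir` is a hypothetical helper name, not part of Docker:

```shell
# List the contents of MongoDB's data directory inside a running container.
# $1: container name or ID (placeholder; look it up with `docker ps`).
list_mongo_data_dir() {
  docker exec "$1" ls -la /data/db
}

# Usage:
#   list_mongo_data_dir <container-id>
```

Comparing this listing with the contents of the local folder shows quickly whether the two are actually the same directory.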

So it looks like what I suspected: this folder is not being mounted to my local drive. If I inspect the running container, I find that the /data/db directory in my container is actually mounted somewhere else:

"Type": "volume",
"Name": "fb529a5a01731942c1535d2ae010b22ab4c35e24b6f1af253cf25944a2e2ac65",
"Source": "/var/lib/docker/volumes/fb529a5a01731942c1535d2ae010b22ab4c35e24b6f1af253cf25944a2e2ac65/_data",
"Destination": "/data/db",
"Driver": "local",
"Mode": "rw",
"RW": true,
"Propagation": ""

This path:

/var/lib/docker/volumes/fb529a5a01731942c1535d2ae010b22ab4c35e24b6f1af253cf25944a2e2ac65/_data

is not on my Windows file system. It lives inside the Hyper-V virtual hard disk, which on my machine is located in C:\Users\Public\Documents\Hyper-V\Virtual hard disks.
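The mount details come from `docker inspect`. A minimal sketch for printing only that section, assuming a container ID taken from `docker ps`; `show_mounts` is a hypothetical helper name:

```shell
# Print the Mounts section of a container's inspect output as JSON.
# $1: container name or ID (placeholder; look it up with `docker ps`).
show_mounts() {
  docker inspect --format '{{ json .Mounts }}' "$1"
}

# Usage:
#   show_mounts <container-id>
```

A folder bind-mounted from the host appears with "Type": "bind" and the host path as its Source. An entry with "Type": "volume" and a long hexadecimal Name is a volume that Docker created and manages itself.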

Docker Lesson: Just because you mount a directory doesn’t mean its sub-directories are necessarily mounted

This is counterintuitive to me. Usually, sub-directories are mounted: if I create a folder in the mounted local path, I see it right away in my Docker container.

So what’s special about the data/db directory and why isn’t it mounted to where I expect?
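One thing worth checking here, which this post does not otherwise cover, is whether the image itself declares /data/db as a volume. An image can do that with the Dockerfile VOLUME instruction, and Docker then creates an anonymous volume for each declared path that has no explicit mount of its own, even when a parent directory is bind-mounted. The sketch below prints the volumes an image declares; `image_volumes` is a hypothetical helper name, and the result depends on the image version you have pulled:

```shell
# Print the volume paths declared by an image (from its VOLUME instructions).
# $1: image name or ID, for example the mongo image used in this post.
image_volumes() {
  docker image inspect --format '{{ json .Config.Volumes }}' "$1"
}

# Usage:
#   image_volumes mongo
```

If /data/db shows up in that output, it would account for the "Type": "volume" entry seen when inspecting the container.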

Let’s do an experiment by directly mounting my local Mongo directory to the /data/db folder in the container, i.e., changing my yml file to include:

    - ../../../data/mongo/db:/data/db

I remove my existing container and start a new one. It doesn’t work. Let’s check the logs of this container:

C:\Users\CTabis\Vena\devops\docker\full-stack>docker container logs e890c9fe6ff1
2018-10-12T19:34:37.243+0000 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=e890c9fe6ff1
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] db version v4.0.1
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] git version: 54f1582fc6eb01de4d4c42f26fc133e623f065fb
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.2g  1 Mar 2016
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] modules: none
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] build environment:
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten]     distmod: ubuntu1604
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten]     distarch: x86_64
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2018-10-12T19:34:37.256+0000 I CONTROL  [initandlisten] options: { net: { bindIpAll: true } }
2018-10-12T19:34:37.272+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=478M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2018-10-12T19:34:37.904+0000 E STORAGE  [initandlisten] WiredTiger error (1) [1539372877:904805][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted Raw: [1539372877:904805][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted
2018-10-12T19:34:37.920+0000 E STORAGE  [initandlisten] WiredTiger error (17) [1539372877:920575][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: File exists Raw: [1539372877:920575][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: File exists
2018-10-12T19:34:37.924+0000 I STORAGE  [initandlisten] WiredTiger message unexpected file WiredTiger.wt found, renamed to WiredTiger.wt.1
2018-10-12T19:34:37.926+0000 E STORAGE  [initandlisten] WiredTiger error (1) [1539372877:926675][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted Raw: [1539372877:926675][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted
2018-10-12T19:34:37.939+0000 E STORAGE  [initandlisten] WiredTiger error (17) [1539372877:939310][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: File exists Raw: [1539372877:939310][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: File exists
2018-10-12T19:34:37.944+0000 I STORAGE  [initandlisten] WiredTiger message unexpected file WiredTiger.wt found, renamed to WiredTiger.wt.2
2018-10-12T19:34:37.946+0000 E STORAGE  [initandlisten] WiredTiger error (1) [1539372877:946721][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted Raw: [1539372877:946721][1:0x7f67cdc36a00], connection: /data/db/WiredTiger.wt: handle-open: open: Operation not permitted
2018-10-12T19:34:37.948+0000 F STORAGE  [initandlisten] Failed to start up WiredTiger under any compatibility version.
2018-10-12T19:34:37.949+0000 F STORAGE  [initandlisten] 1: Operation not permitted
2018-10-12T19:34:37.949+0000 F -        [initandlisten] Fatal Assertion 28595 at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp 194
2018-10-12T19:34:37.949+0000 F -        [initandlisten]

***aborting after fassert() failure

After much googling I find:

IMPORTANT: MongoDB requires a filesystem that supports fsync() on directories. For example, HGFS and Virtual Box’s shared folders do not support this operation.

Therefore, the directory where MongoDB stores its data files cannot be mounted directly to any of my local directories. This means that each time a Docker container is recreated, the data is stored somewhere else inside Docker and I no longer have access to the old data.

Great. So one solution is to never lose your Mongo Docker container. This is not ideal or guaranteed. We should be able to delete and spin up multiple containers using the same image and expect them to persist data.

How else can I access the same persisted Mongo data throughout multiple docker container instances?

Another solution is to create an external volume, so that every container refers to the same place in the Hyper-V virtual hard disk. Create it from your command line:

docker volume create mongo_external
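Compose does not create volumes that are marked external; it expects them to exist already and reports an error otherwise. A small sketch that creates the volume only when it is missing, so the step is safe to repeat in a setup script (`ensure_volume` is a hypothetical helper name):

```shell
# Create a named Docker volume unless one with that name already exists.
# $1: volume name, for example mongo_external.
ensure_volume() {
  docker volume inspect "$1" >/dev/null 2>&1 || docker volume create "$1"
}

# Usage:
#   ensure_volume mongo_external
```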

Then, in a separate top-level section of your docker-compose.yml, add a reference to that external volume. For details on the top-level volumes key, see:

https://docs.docker.com/compose/compose-file/#volume-configuration-reference

volumes:
   mongoData:
     external:
       name: mongo_external

Then, also in the .yml file, reference the top-level volume key as your mounted volume, making sure you map it to the Mongo data folder /data/db:

mongo:
  image: mongo
  volumes:
    - mongoData:/data/db
    - ../../../backups:/backups

Now each time a Mongo container is created, it references the same place.
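To check that the data really does survive, one option is to remove the Mongo container, bring it back up, and confirm the databases are still there. A minimal sketch, assuming the Compose service is named mongo as in the file above and that docker-compose is run from the directory containing docker-compose.yml; `recreate_mongo` is a hypothetical helper name:

```shell
# Stop and remove the mongo service's container, then start a fresh one.
# The external volume is not touched by `rm`, so the new container
# should attach to the same data.
recreate_mongo() {
  docker-compose rm -s -f mongo && docker-compose up -d mongo
}

# Usage:
#   recreate_mongo
```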

This does not actually mount to anywhere on your local disk. You still can’t see your files on the local disk, but your data will no longer be lost each time you recreate a container 🙂