Load a model in a session

You can import models from other sessions within your code using the NSML library. The approach is to load another session’s model with nsml.load, store it in the current session with nsml.save, and then exit, so that the new session exists only to hold the copied model.

Let’s walk through it with the baseline example.

First, start a session with the nsml run command.

$ ls
README.md      data_loader.py main.py        setup.py

$ nsml run -d mnist
INFO[2019/07/19 15:06:04.170] .nsmlignore check - start
INFO[2019/07/19 15:06:04.170] .nsmlignore check - done
INFO[2019/07/19 15:06:04.235] file integrity check - start
INFO[2019/07/19 15:06:04.237] file integrity check - done
INFO[2019/07/19 15:06:04.238] .nsmlignore 16 B - start
INFO[2019/07/19 15:06:04.238] .nsmlignore 16 B - done (1/5 20.00%) (16 B/18 KiB 0.09%)
INFO[2019/07/19 15:06:04.238] README.md 9.1 KiB - start
INFO[2019/07/19 15:06:04.239] README.md 9.1 KiB - done (2/5 40.00%) (9.2 KiB/18 KiB 50.32%)
INFO[2019/07/19 15:06:04.239] data_loader.py 1.5 KiB - start
INFO[2019/07/19 15:06:04.239] data_loader.py 1.5 KiB - done (3/5 60.00%) (11 KiB/18 KiB 58.57%)
INFO[2019/07/19 15:06:04.239] main.py 7.3 KiB - start
INFO[2019/07/19 15:06:04.239] main.py 7.3 KiB - done (4/5 80.00%) (18 KiB/18 KiB 98.81%)
INFO[2019/07/19 15:06:04.239] setup.py 221 B - start
INFO[2019/07/19 15:06:04.239] setup.py 221 B - done (5/5 100.00%) (18 KiB/18 KiB 100.00%)
.....
Building docker image. It might take for a while
......
Session nsml_team/mnist/48 is started

If you list the models of the newly created session 48 with nsml model ls, you can see the checkpoints it produced:

$ nsml model ls nsml_team/mnist/48
Checkpoint    Last Modified    Elapsed    Summary                                                                    Size
------------  ---------------  ---------  -------------------------------------------------------------------------  ---------
0             36 minutes ago   3.397      epoch_total=5, loss=7.083731204539806, acc=0.0019707207207207205, epoch=0  366.74 MB
1             36 minutes ago   24.667     epoch_total=5, loss=6.765417760556883, acc=0.00563063063063063, epoch=1    366.74 MB
2             35 minutes ago   24.624     epoch_total=5, loss=6.254474949192357, acc=0.02294481981981982, epoch=2    366.74 MB
3             35 minutes ago   24.751     epoch_total=5, loss=5.404983241278846, acc=0.08727477477477477, epoch=3    366.74 MB
4             34 minutes ago   24.630     epoch_total=5, loss=4.27992379557979, acc=0.21494932432432431, epoch=4     366.74 MB
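
To decide which checkpoint to load, you can compare the metrics in the Summary column. The sketch below is plain Python and does not call NSML. It assumes the listing was copied as text from the output above, and best_checkpoint is a hypothetical helper name.

```python
import re

# Rows copied from the `nsml model ls` output above (header lines omitted).
LISTING = """\
0             36 minutes ago   3.397      epoch_total=5, loss=7.083731204539806, acc=0.0019707207207207205, epoch=0  366.74 MB
1             36 minutes ago   24.667     epoch_total=5, loss=6.765417760556883, acc=0.00563063063063063, epoch=1    366.74 MB
2             35 minutes ago   24.624     epoch_total=5, loss=6.254474949192357, acc=0.02294481981981982, epoch=2    366.74 MB
3             35 minutes ago   24.751     epoch_total=5, loss=5.404983241278846, acc=0.08727477477477477, epoch=3    366.74 MB
4             34 minutes ago   24.630     epoch_total=5, loss=4.27992379557979, acc=0.21494932432432431, epoch=4     366.74 MB
"""


def best_checkpoint(listing, metric="acc"):
    """Return (checkpoint_name, value) for the row with the highest metric."""
    pattern = re.compile(rf"\b{re.escape(metric)}=([0-9.eE+-]+)")
    best_name, best_value = None, float("-inf")
    for line in listing.splitlines():
        match = pattern.search(line)
        if not match:
            continue  # skip headers, separators, and rows without the metric
        value = float(match.group(1))
        if value > best_value:
            # The checkpoint name is the first whitespace-separated field.
            best_name, best_value = line.split()[0], value
    return best_name, best_value


name, value = best_checkpoint(LISTING)
print(name)  # 4
```

For a metric where lower is better, such as loss, the comparison would need to be reversed.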

Let’s create a new session that loads checkpoint 4 of session 48 using the nsml.load function.

Add the following three lines of code to the main.py of the baseline:

nsml.load(checkpoint='4', session='nsml_team/mnist/48')
nsml.save('saved')
exit()

Pass the checkpoint name and the session name to the load() function, then store the model under a new checkpoint name with the save() function. The best place for this code is just below bTrainmode = True, as in the example below. If you put the code anywhere else, it must come after the call to bind_model(model).

bind_model(model)

if config.pause:
    nsml.paused(scope=locals())

bTrainmode = False
if config.mode == 'train':
    bTrainmode = True

    # The three load/save lines go here.
    nsml.load(checkpoint='4', session='nsml_team/mnist/48')
    nsml.save('saved')
    exit()
    # If you place them elsewhere, put them after the bind_model() call.

    """ Initiate RMSprop optimizer """
    opt = keras.optimizers.rmsprop(lr=0.00045, decay=1e-6)
    model.compile(loss='categorical_crossentropy',
                  optimizer=opt,
                  metrics=['accuracy'])
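
The three lines can also be wrapped in a small helper. The following is a minimal sketch, not part of NSML: copy_checkpoint and the fake_nsml stand-in are hypothetical names, and only the load(checkpoint=..., session=...) and save(name) calls are taken from the example above. The stand-in exists so the sketch runs without an NSML environment.

```python
import sys
from types import SimpleNamespace


def copy_checkpoint(nsml_module, checkpoint, session, new_name, terminate=sys.exit):
    """Load a checkpoint from another session, save it here, then stop.

    Call this only after bind_model(model) has been called.
    """
    nsml_module.load(checkpoint=checkpoint, session=session)
    nsml_module.save(new_name)
    terminate()  # stop before any training code runs


# Hypothetical stand-in that records calls instead of talking to NSML.
calls = []
fake_nsml = SimpleNamespace(
    load=lambda checkpoint, session: calls.append(("load", checkpoint, session)),
    save=lambda name: calls.append(("save", name)),
)

copy_checkpoint(fake_nsml, "4", "nsml_team/mnist/48", "saved",
                terminate=lambda: calls.append(("exit",)))
print(calls)
```

In a real session you would pass the imported nsml module as the first argument and leave terminate at its default.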

Run the session again. This time, session 49 starts.

$ nsml run -d mnist
INFO[2019/07/19 15:10:10.803] .nsmlignore check - start
INFO[2019/07/19 15:10:10.804] .nsmlignore check - done
INFO[2019/07/19 15:10:10.942] file integrity check - start
INFO[2019/07/19 15:10:10.944] file integrity check - done
INFO[2019/07/19 15:10:10.946] .nsmlignore 16 B - start
INFO[2019/07/19 15:10:10.946] .nsmlignore 16 B - done (1/5 20.00%) (16 B/18 KiB 0.09%)
INFO[2019/07/19 15:10:10.946] README.md 9.1 KiB - start
INFO[2019/07/19 15:10:10.946] README.md 9.1 KiB - done (2/5 40.00%) (9.2 KiB/18 KiB 50.02%)
INFO[2019/07/19 15:10:10.946] data_loader.py 1.5 KiB - start
INFO[2019/07/19 15:10:10.947] data_loader.py 1.5 KiB - done (3/5 60.00%) (11 KiB/18 KiB 58.23%)
INFO[2019/07/19 15:10:10.947] main.py 7.4 KiB - start
INFO[2019/07/19 15:10:10.947] main.py 7.4 KiB - done (4/5 80.00%) (18 KiB/18 KiB 98.82%)
INFO[2019/07/19 15:10:10.947] setup.py 221 B - start
INFO[2019/07/19 15:10:10.947] setup.py 221 B - done (5/5 100.00%) (18 KiB/18 KiB 100.00%)
.....
Building docker image. It might take for a while
......
Session nsml_team/mnist/49 is started

If the log contains the following two messages, the model was loaded and saved correctly.

$ nsml logs nsml_team/mnist/49
...
model loaded!
model saved!
...
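
The same check can be scripted. This is a minimal sketch, assuming the log text has been captured into a string (for example from the output of nsml logs) and that each message appears on its own line as shown above. model_was_copied is a hypothetical helper name.

```python
def model_was_copied(log_text):
    """Return True if 'model loaded!' is followed later by 'model saved!'."""
    lines = [line.strip() for line in log_text.splitlines()]
    if "model loaded!" not in lines:
        return False
    loaded_at = lines.index("model loaded!")
    return "model saved!" in lines[loaded_at + 1:]


sample_log = "...\nmodel loaded!\nmodel saved!\n...\n"
print(model_was_copied(sample_log))        # True
print(model_was_copied("model saved!\n"))  # False
```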

A checkpoint named ‘saved’ is created, because that is the name we passed to the nsml.save() function.

$ nsml model ls nsml_team/mnist/49
Checkpoint    Last Modified    Elapsed    Summary    Size
------------  ---------------  ---------  ---------  ---------
saved         20 minutes ago   0.000                 366.74 MB

You can submit your model to the leaderboard using the nsml submit command in the same way as any other session.

$ nsml submit nsml_team/mnist/49 saved
........
Building docker image. It might take for a while
.............
Score: 0.012391527150908917
Done