MobileNet-3D V1 - Wolfram Neural Net Repository

MobileNet-3D V1 Trained on Video Datasets

Identify the action in a video

Released in 2019, this family of nets consists of three-dimensional (3D) versions of the original MobileNet V1 architecture, adapted for video classification. By combining depthwise separable convolutions with 3D convolutions, these lightweight and efficient models achieve much higher video classification accuracy than their two-dimensional counterparts.

Number of models: 8

Training Set Information

Performance

Examples

Resource retrieval

Get the pre-trained net:

In[1]:=
NetModel["MobileNet-3D V1 Trained on Video Datasets"]
Out[1]=

NetModel parameters

This model consists of a family of individual nets, each identified by a specific parameter combination. Inspect the available parameters:

In[2]:=
NetModel["MobileNet-3D V1 Trained on Video Datasets", "ParametersInformation"]
Out[2]=

Pick a non-default net by specifying the parameters:

In[3]:=
NetModel[{"MobileNet-3D V1 Trained on Video Datasets", "Dataset" -> "Jester", "Width" -> 1.0}]
Out[3]=
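
One rough way to compare members of the family is to look at their total parameter counts. The following is a minimal sketch, assuming only the parameter values already shown on this page; the variable names (defaultNet, jesterNet, counts) are illustrative and not part of the resource:

```wolfram
(* Sketch: compare the size of the default net with a non-default family member. *)
(* Variable names are illustrative; only parameter values shown on this page are used. *)
defaultNet = NetModel["MobileNet-3D V1 Trained on Video Datasets"];
jesterNet = NetModel[{"MobileNet-3D V1 Trained on Video Datasets",
    "Dataset" -> "Jester", "Width" -> 1.0}];

(* Total number of array elements (weights and biases) in each net *)
counts = Map[
  NetInformation[#, "ArraysTotalElementCount"] &,
  <|"Default" -> defaultNet, "Jester, width 1.0" -> jesterNet|>
]
```

The result is an association from the illustrative labels to the element counts. If the two nets differ in size at the same width, the likely cause is a different number of output classes.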

Pick a non-default uninitialized net:

In[4]:=
NetModel[{"MobileNet-3D V1 Trained on Video Datasets", "Dataset" -> "Jester", "Width" -> 1.5}, "UninitializedEvaluationNet"]
Out[4]=

Basic usage

Identify the main action in a video:

In[5]:=
yoga = ResourceData["Sample Video: Practicing Yoga"];
In[6]:=
NetModel["MobileNet-3D V1 Trained on Video Datasets"][yoga]
Out[6]=

Obtain the probabilities of the 10 most likely actions predicted by the net:

In[7]:=
NetModel["MobileNet-3D V1 Trained on Video Datasets"][yoga, {"TopProbabilities", 10}]
Out[7]=
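
The output of this net is decoded into class labels, so other standard class decoder properties should be available in the same way. A hedged sketch, assuming the default net and the sample video used above; the variable names net and probs are illustrative:

```wolfram
(* Sketch: query other class decoder properties on the same input. *)
net = NetModel["MobileNet-3D V1 Trained on Video Datasets"];
yoga = ResourceData["Sample Video: Practicing Yoga"];

(* The three most likely classes, without their probabilities *)
net[yoga, {"TopDecisions", 3}]

(* The full distribution as an association: class -> probability *)
probs = net[yoga, "Probabilities"];

(* Keep only the three largest entries of that distribution *)
TakeLargest[probs, 3]
```

Storing the net in a variable avoids repeating the NetModel call for every query.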

Obtain the list of names of all available classes:

In[8]:=
NetExtract[NetModel["MobileNet-3D V1 Trained on Video Datasets"], "Output"][["Labels"]]
Out[8]=
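
Once extracted, the label list can be inspected like any other list. A small sketch, assuming the default net; the variable names net and labels are illustrative:

```wolfram
(* Sketch: inspect the class labels of the default net. *)
net = NetModel["MobileNet-3D V1 Trained on Video Datasets"];
labels = NetExtract[net, "Output"][["Labels"]];

(* Number of classes the net can predict *)
Length[labels]

(* A few randomly chosen class names *)
RandomSample[labels, 5]
```

The number of classes depends on the dataset the chosen net was trained on, so the same code applied to a non-default net may return a different length.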

Resource History

Reference