.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/visualkeras_NLP/plot_nlp_mpnet_with_tf_layers.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_visualkeras_NLP_plot_nlp_mpnet_with_tf_layers.py>`
        to download the full example code, or to run this example in your browser via JupyterLite or Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_visualkeras_NLP_plot_nlp_mpnet_with_tf_layers.py:

visualkeras: transformers example
==========================================

An example showing the :py:func:`~scikitplot.visualkeras` function used with a
:py:class:`~tensorflow.keras.Model`, :py:class:`~torch.nn.Module`, or
:py:class:`~transformers.TFPreTrainedModel` model.

.. GENERATED FROM PYTHON SOURCE LINES 9-13

.. code-block:: Python
   :lineno-start: 10

   # Authors: The scikit-plots developers
   # SPDX-License-Identifier: BSD-3-Clause

.. GENERATED FROM PYTHON SOURCE LINES 14-15

pip install protobuf==5.29.4

.. GENERATED FROM PYTHON SOURCE LINES 15-24

.. code-block:: Python
   :lineno-start: 15

   import tensorflow as tf

   # Clear any session to reset the state of TensorFlow/Keras
   tf.keras.backend.clear_session()

   from transformers import TFAutoModel

   from scikitplot import visualkeras

.. GENERATED FROM PYTHON SOURCE LINES 25-26

Load the Hugging Face transformer model

.. GENERATED FROM PYTHON SOURCE LINES 26-36

.. code-block:: Python
   :lineno-start: 26

   transformer_model = TFAutoModel.from_pretrained("microsoft/mpnet-base")


   # Define a Keras-compatible wrapper for the Hugging Face model
   def wrap_transformer_model(inputs):
       input_ids, attention_mask = inputs
       outputs = transformer_model(input_ids=input_ids, attention_mask=attention_mask)
       return outputs.last_hidden_state  # Return the last hidden state for visualization

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    TensorFlow and JAX classes are deprecated and will be removed in Transformers v5. We recommend migrating to PyTorch classes or pinning your version of Transformers.
    Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFMPNetModel: ['lm_head.layer_norm.weight', 'lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.bias', 'lm_head.bias']
    - This IS expected if you are initializing TFMPNetModel from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
    - This IS NOT expected if you are initializing TFMPNetModel from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
    Some weights or buffers of the TF 2.0 model TFMPNetModel were not initialized from the PyTorch model and are newly initialized: ['mpnet.pooler.dense.weight', 'mpnet.pooler.dense.bias']
    You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
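The wrapper can be exercised on its own before any Keras graph is built. The
short sketch below is not part of the generated example: it assumes the
matching ``microsoft/mpnet-base`` tokenizer checkpoint and the illustrative
names ``tokenizer``/``encoded``, and simply checks that the wrapper returns
the expected ``(batch, 128, 768)`` hidden-state tensor.

.. code-block:: Python

   from transformers import AutoTokenizer

   # Tokenize one sentence to the fixed length (128 tokens) that the Keras
   # inputs defined below will expect, returning TensorFlow tensors
   tokenizer = AutoTokenizer.from_pretrained("microsoft/mpnet-base")
   encoded = tokenizer(
       ["A short sentence to probe the wrapped transformer."],
       padding="max_length",
       truncation=True,
       max_length=128,
       return_tensors="tf",
   )

   hidden = wrap_transformer_model([encoded["input_ids"], encoded["attention_mask"]])
   print(hidden.shape)  # expected: (1, 128, 768) for mpnet-base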
.. GENERATED FROM PYTHON SOURCE LINES 37-38

Define Keras model inputs

.. GENERATED FROM PYTHON SOURCE LINES 38-105

.. code-block:: Python
   :lineno-start: 38

   input_ids = tf.keras.Input(shape=(128,), dtype=tf.int32, name="input_ids")
   attention_mask = tf.keras.Input(shape=(128,), dtype=tf.int32, name="attention_mask")

   # Pass inputs through the transformer model using a Lambda layer
   last_hidden_state = tf.keras.layers.Lambda(
       wrap_transformer_model,
       output_shape=(128, 768),  # Explicitly specify the output shape
       name="microsoft_mpnet-base",
   )([input_ids, attention_mask])

   # Reshape the output to fit into Conv2D (adding an extra dimension) inside a Lambda layer
   # def reshape_last_hidden_state(x):
   #     return tf.reshape(x, (-1, 1, 128, 768))
   # reshaped_output = tf.keras.layers.Lambda(reshape_last_hidden_state)(last_hidden_state)

   # Use a Reshape layer instead; the leading -1 is inferred as 1, so the
   # result is (batch_size, 1, 128, 768) for Conv2D input
   reshaped_output = tf.keras.layers.Reshape((-1, 128, 768))(last_hidden_state)

   # Add different layers to the model
   x = tf.keras.layers.Conv2D(
       512, (3, 3), activation="relu", padding="same", name="conv2d_1"
   )(reshaped_output)
   x = tf.keras.layers.BatchNormalization(name="batchnorm_1")(x)
   x = tf.keras.layers.Dropout(0.3, name="dropout_1")(x)
   x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2), name="maxpool_1")(x)

   x = tf.keras.layers.Conv2D(
       256, (3, 3), activation="relu", padding="same", name="conv2d_2"
   )(x)
   x = tf.keras.layers.BatchNormalization(name="batchnorm_2")(x)
   x = tf.keras.layers.Dropout(0.3, name="dropout_2")(x)
   x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2), name="maxpool_2")(x)

   x = tf.keras.layers.Conv2D(
       128, (3, 3), activation="relu", padding="same", name="conv2d_3"
   )(x)
   x = tf.keras.layers.BatchNormalization(name="batchnorm_3")(x)
   x = tf.keras.layers.Dropout(0.4, name="dropout_3")(x)
   x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2), name="maxpool_3")(x)

   # Add GlobalAveragePooling2D before the Dense layers
   x = tf.keras.layers.GlobalAveragePooling2D(name="globalaveragepool")(x)

   # Add Dense layers
   x = tf.keras.layers.Dense(512, activation="relu", name="dense_1")(x)
   x = tf.keras.layers.Dropout(0.5, name="dropout_4")(x)
   x = tf.keras.layers.Dense(128, activation="relu", name="dense_2")(x)

   # Add output layer (classification head)
   dummy_output = tf.keras.layers.Dense(
       2, activation="softmax", name="dummy_classification_head"
   )(x)

   # Wrap into a Keras model
   wrapped_model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=dummy_output)

   # https://github.com/keras-team/keras/blob/v3.3.3/keras/src/models/model.py#L217
   # https://github.com/keras-team/keras/blob/master/keras/src/utils/summary_utils.py#L121
   wrapped_model.summary(
       line_length=None,
       positions=None,
       print_fn=None,
       expand_nested=False,
       show_trainable=True,
       layer_range=None,
   )

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Model: "functional"
    ┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━┓
    ┃ Layer (type)      ┃ Output Shape    ┃   Param # ┃ Connected to   ┃ Trai… ┃
    ┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━┩
    │ input_ids         │ (None, 128)     │         0 │ -              │   -   │
    │ (InputLayer)      │                 │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ attention_mask    │ (None, 128)     │         0 │ -              │   -   │
    │ (InputLayer)      │                 │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ microsoft_mpnet-… │ (None, 128,     │         0 │ input_ids[0][… │   -   │
    │ (Lambda)          │ 768)            │           │ attention_mas… │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ reshape (Reshape) │ (None, 1, 128,  │         0 │ microsoft_mpn… │   -   │
    │                   │ 768)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ conv2d_1 (Conv2D) │ (None, 1, 128,  │ 3,539,456 │ reshape[0][0]  │   Y   │
    │                   │ 512)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ batchnorm_1       │ (None, 1, 128,  │     2,048 │ conv2d_1[0][0] │   Y   │
    │ (BatchNormalizat… │ 512)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ dropout_1         │ (None, 1, 128,  │         0 │ batchnorm_1[0… │   -   │
    │ (Dropout)         │ 512)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ maxpool_1         │ (None, 0, 64,   │         0 │ dropout_1[0][… │   -   │
    │ (MaxPooling2D)    │ 512)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ conv2d_2 (Conv2D) │ (None, 0, 64,   │ 1,179,904 │ maxpool_1[0][… │   Y   │
    │                   │ 256)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ batchnorm_2       │ (None, 0, 64,   │     1,024 │ conv2d_2[0][0] │   Y   │
    │ (BatchNormalizat… │ 256)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ dropout_2         │ (None, 0, 64,   │         0 │ batchnorm_2[0… │   -   │
    │ (Dropout)         │ 256)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ maxpool_2         │ (None, 0, 32,   │         0 │ dropout_2[0][… │   -   │
    │ (MaxPooling2D)    │ 256)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ conv2d_3 (Conv2D) │ (None, 0, 32,   │   295,040 │ maxpool_2[0][… │   Y   │
    │                   │ 128)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ batchnorm_3       │ (None, 0, 32,   │       512 │ conv2d_3[0][0] │   Y   │
    │ (BatchNormalizat… │ 128)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ dropout_3         │ (None, 0, 32,   │         0 │ batchnorm_3[0… │   -   │
    │ (Dropout)         │ 128)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ maxpool_3         │ (None, 0, 16,   │         0 │ dropout_3[0][… │   -   │
    │ (MaxPooling2D)    │ 128)            │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ globalaveragepool │ (None, 128)     │         0 │ maxpool_3[0][… │   -   │
    │ (GlobalAveragePo… │                 │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ dense_1 (Dense)   │ (None, 512)     │    66,048 │ globalaverage… │   Y   │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ dropout_4         │ (None, 512)     │         0 │ dense_1[0][0]  │   -   │
    │ (Dropout)         │                 │           │                │       │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ dense_2 (Dense)   │ (None, 128)     │    65,664 │ dropout_4[0][… │   Y   │
    ├───────────────────┼─────────────────┼───────────┼────────────────┼───────┤
    │ dummy_classifica… │ (None, 2)       │       258 │ dense_2[0][0]  │   Y   │
    │ (Dense)           │                 │           │                │       │
    └───────────────────┴─────────────────┴───────────┴────────────────┴───────┘
     Total params: 5,149,954 (19.65 MB)
     Trainable params: 5,148,162 (19.64 MB)
     Non-trainable params: 1,792 (7.00 KB)
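Two details of this summary are worth verifying by hand. The parameter counts
follow directly from the layer shapes, and the ``0``-sized dimension that
appears from ``maxpool_1`` onward comes from 2x2 "valid" pooling flooring the
size-1 axis introduced by the ``Reshape`` (``1 // 2 == 0``). The sketch below
is not part of the generated example; the variable names are illustrative.

.. code-block:: Python

   # Conv2D parameters: kernel_h * kernel_w * in_channels * filters + filters (bias)
   conv2d_1_params = 3 * 3 * 768 * 512 + 512
   print(conv2d_1_params)  # 3539456 -- the 3,539,456 reported above

   # BatchNormalization: gamma and beta (trainable) plus moving mean and
   # moving variance (non-trainable) give 4 parameters per channel
   batchnorm_1_params = 4 * 512
   print(batchnorm_1_params)  # 2048

   # MaxPooling2D with the default "valid" padding floors each spatial axis,
   # so the (1, 128) spatial grid entering maxpool_1 becomes (0, 64)
   print((1 // 2, 128 // 2))  # (0, 64) -> the (None, 0, 64, 512) row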
.. GENERATED FROM PYTHON SOURCE LINES 106-107

Visualize the wrapped model

.. GENERATED FROM PYTHON SOURCE LINES 107-128

.. code-block:: Python
   :lineno-start: 107

   img_nlp_mpnet_with_tf_layers = visualkeras.layered_view(
       wrapped_model,
       legend=True,
       show_dimension=True,
       min_z=1,
       min_xy=1,
       max_z=4096,
       max_xy=4096,
       scale_z=1,
       scale_xy=1,
       font={"font_size": 99},
       text_callable="default",
       # to_file="result_images/nlp_mpnet_with_tf_layers.png",
       save_fig=True,
       save_fig_filename="nlp_mpnet_with_tf_layers.png",
       overwrite=False,
       add_timestamp=True,
       verbose=True,
   )
   img_nlp_mpnet_with_tf_layers

.. image-sg:: /auto_examples/visualkeras_NLP/images/sphx_glr_plot_nlp_mpnet_with_tf_layers_001.png
   :alt: plot nlp mpnet with tf layers
   :srcset: /auto_examples/visualkeras_NLP/images/sphx_glr_plot_nlp_mpnet_with_tf_layers_001.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [INFO] Saving path to: /home/circleci/repo/galleries/examples/visualkeras_NLP/result_images/nlp_mpnet_with_tf_layers_20250629_134309Z.png
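The call above both saves the diagram to disk and returns it for inline
display. As a follow-up, the returned object can also be rendered in an
ordinary matplotlib figure; this sketch is not part of the generated example
and assumes ``layered_view`` returns a ``PIL.Image``, as it does in upstream
visualkeras.

.. code-block:: Python

   import matplotlib.pyplot as plt

   # Render the layer diagram inside a matplotlib figure (plt.imshow
   # accepts a PIL.Image directly)
   plt.figure(figsize=(12, 4))
   plt.imshow(img_nlp_mpnet_with_tf_layers)
   plt.axis("off")
   plt.show()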
.. GENERATED FROM PYTHON SOURCE LINES 129-137

.. tags::

   model-type: classification
   model-workflow: model building
   plot-type: visualkeras
   domain: neural network
   level: advanced
   purpose: showcase

.. rst-class:: sphx-glr-timing

**Total running time of the script:** (0 minutes 11.673 seconds)


.. _sphx_glr_download_auto_examples_visualkeras_NLP_plot_nlp_mpnet_with_tf_layers.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-plots/scikit-plots/main?urlpath=lab/tree/notebooks/auto_examples/visualkeras_NLP/plot_nlp_mpnet_with_tf_layers.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/index.html?path=auto_examples/visualkeras_NLP/plot_nlp_mpnet_with_tf_layers.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_nlp_mpnet_with_tf_layers.ipynb <plot_nlp_mpnet_with_tf_layers.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_nlp_mpnet_with_tf_layers.py <plot_nlp_mpnet_with_tf_layers.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_nlp_mpnet_with_tf_layers.zip <plot_nlp_mpnet_with_tf_layers.zip>`

.. include:: plot_nlp_mpnet_with_tf_layers.recommendations

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_