Foundations --> CNN clarifications #31

GiteaMirror · 2025-11-02T00:01:54-05:00

GiteaMirror commented

2025-11-02 00:01:54 -05:00

Originally created by @gitgithan on GitHub (Oct 19, 2021).

1. Under Modelling there is a sequence of 3D diagrams showing the flow of shapes. It seems that the vocab_size dimension disappears after the convolution step. The earlier gifs showing convolution use a single integer in each cell rather than a one-hot encoded vector. I was hoping for some explanation of where the vocab_size dimension goes during convolution, e.g. what kind of aggregation happens there.
2. Annotations of the shapes that PyTorch requires (including the manual transpose of axes 1 and 2) under each step would be very helpful. I have been trying to see the shapes throughout the flow using torchsummary.summary(model,(500,8,1)), but no matter what pattern I try it gives ValueError: too many values to unpack (expected 1).
It breaks in user-defined code, which is strange because I thought it would be torchsummary's issue. If I turn this 3-tuple into a single integer, the user code passes but torchsummary breaks, saying the integer is not iterable.

Does torchsummary work by sending random values through the pipeline to get the shapes, which would explain why it has to run user code and why I see this unpacking error? How do I properly use torchsummary to view CNN shapes?

     19 
     20         # Rearrange input so num_channels is in dim 1 (N, C, L)
---> 21         x_in, = inputs
     22         if not channel_first:
     23             x_in = x_in.transpose(1, 2)
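
For reference, the unpacking error in the traceback can be reproduced without torchsummary or PyTorch. The sketch below is only an illustration: `forward` is a hypothetical stand-in for the lesson's method, and a nested list stands in for a tensor (a tensor, like a list, iterates over its first dimension). It assumes the summary tool passes a bare batch with more than one row, which is a guess about its internals and has not been checked against its source.

```python
# Hypothetical stand-in for the lesson's forward(): it expects `inputs`
# to be a 1-element list or tuple that wraps the actual batch.
def forward(inputs):
    x_in, = inputs  # succeeds only when `inputs` holds exactly one item
    return x_in

# Two rows, mimicking a batch of size 2 (sizes are illustrative).
batch = [[0.0] * 8, [0.0] * 8]

# Wrapped in a list: the single item unpacks cleanly.
wrapped_ok = forward([batch]) is batch
print(wrapped_ok)

# Passed bare: unpacking iterates over the 2 rows and fails.
try:
    forward(batch)
    message = ""
except ValueError as err:
    message = str(err)
    print(message)
```

If that assumption holds, the error comes from the mismatch between what `forward` expects (a wrapped batch) and what the tool passes (a bare one), not from the shape pattern itself. One alternative that avoids the tool entirely is to call the model on a sample batch and print `.shape` after each step.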

GiteaMirror closed this issue

2025-11-02 00:01:54 -05:00

GiteaMirror commented

2025-11-02 00:01:55 -05:00

@GokuMohandas commented on GitHub (Oct 19, 2021):

1. Each filter has the same depth as the input, so it spans the whole vocab_size dimension and reduces it to a single value per position. The output depth is then the number of filters. This diagram might provide more insight: https://raw.githubusercontent.com/GokuMohandas/MadeWithML/main/images/foundations/cnn/conv.png
2. I remember trying to use torchsummary a couple of years ago and running into issues, so I removed it here. I'll take a look to see if any changes have been made that make this possible.
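
To make the first point concrete, here is a minimal pure-Python sketch of a 1D convolution over one-hot encoded tokens. It is not the lesson's code: the sizes, the token ids, the filter weights and the helper names `one_hot` and `conv1d` are all made up for illustration, and it assumes stride 1 with no padding. It shows that each filter sums over both the kernel window and the full vocab_size depth, so the output has one row per filter and no vocab_size axis.

```python
# Illustrative sizes only; these are not the values used in the lesson.
vocab_size, seq_len, kernel_size, num_filters = 5, 8, 3, 4

def one_hot(index, size):
    """Return a one-hot vector of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def conv1d(x, filters):
    """x: [seq_len][vocab_size]; filters: [num_filters][kernel_size][vocab_size].
    Returns [num_filters][seq_len - kernel_size + 1] (stride 1, no padding)."""
    k_size = len(filters[0])
    out = []
    for f in filters:
        row = []
        for start in range(len(x) - k_size + 1):
            total = 0.0
            for k in range(k_size):
                # Sum over the full depth: this is where vocab_size is aggregated.
                for v in range(len(f[k])):
                    total += f[k][v] * x[start + k][v]
            row.append(total)
        out.append(row)
    return out

tokens = [0, 3, 1, 4, 2, 2, 0, 1]            # made-up token ids
x = [one_hot(t, vocab_size) for t in tokens]  # shape: seq_len x vocab_size

# Made-up weights that differ per filter, kernel position and vocab index.
filters = [
    [[0.1 * (n + 1) + 0.01 * k + 0.001 * v for v in range(vocab_size)]
     for k in range(kernel_size)]
    for n in range(num_filters)
]

out = conv1d(x, filters)
print(len(x), len(x[0]))      # input:  seq_len x vocab_size
print(len(out), len(out[0]))  # output: num_filters x (seq_len - kernel_size + 1)
```

Because the inputs are one-hot, the sum over the depth simply selects the filter weight at each token's index. That is why the gifs can show a single integer per cell and still describe the same operation.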

GiteaMirror referenced this issue

2025-11-02 00:04:20 -05:00

[PR #31] [MERGED] started computer vision notebook #105

Reference: github-starred/Made-With-ML#31