mirror of
https://github.com/GokuMohandas/Made-With-ML.git
synced 2026-03-09 07:12:37 -05:00
num_classes vs num_tokens #42
Originally created by @Aman0807 on GitHub (Nov 24, 2022).
The following padding function used in https://madewithml.com/courses/foundations/convolutional-neural-networks/ refers to `num_classes`, which in the example used comes up to 500. I was wondering if it should be referred to as `num_tokens` (as used in other functions). I'm just getting confused since, as per my understanding, `num_classes = 4`.

@GokuMohandas commented on GitHub (Nov 25, 2022):
Hi @Aman0807, this is a great question, and naming like this is actually a major source of bugs even at very mature companies. Typically, when I write a function, I want to make it as modular and reusable as possible, so that it works for my current context and also for my team's future contexts. In our task here, the number of classes happens to be the same as the number of tokens, but someone else's task may still require padding while having nothing to do with tokens, so that name wouldn't make sense to them. So when writing functions, it's good practice to use names that are universal (but still descriptive).
I've often seen functions written around variables that only exist in the context they were originally written for. Then, when others reuse the same function (via copy/paste or from an internal ML library), they get errors because that variable doesn't exist or doesn't make sense in their context.
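
To make the naming point concrete, here is a minimal sketch of a padding helper. It is an illustration under stated assumptions, not necessarily the lesson's exact code: it assumes each sequence is a non-empty list of fixed-width vectors (for example, one-hot rows), uses plain Python lists to stay dependency-free, and the `pad_value` parameter and both sample batches are hypothetical.

```python
def pad_sequences(sequences, max_seq_len=0, pad_value=0.0):
    """Pad variable-length sequences of fixed-width vectors to a common length.

    Assumes `sequences` is non-empty and that its first sequence has at
    least one step, so the vector width can be read from it.
    """
    max_seq_len = max(max_seq_len, max(len(seq) for seq in sequences))
    # Width of each per-step vector. For one-hot encoded text this happens to
    # equal the vocabulary size, but the function itself never relies on that.
    num_classes = len(sequences[0][0])
    padded = []
    for seq in sequences:
        rows = [list(step) for step in seq]
        # Append all-`pad_value` rows until the sequence reaches max_seq_len.
        rows += [[pad_value] * num_classes for _ in range(max_seq_len - len(seq))]
        padded.append(rows)
    return padded


# Text task (hypothetical 5-token vocabulary): each step is a one-hot token.
text_batch = [
    [[1, 0, 0, 0, 0], [0, 0, 1, 0, 0]],
    [[0, 1, 0, 0, 0]],
]
padded_text = pad_sequences(text_batch)

# Non-text task (hypothetical): each step is a one-hot weather state out of
# 3 classes, where a name like `num_tokens` would not make sense.
weather_batch = [
    [[1, 0, 0]],
    [[0, 1, 0], [0, 0, 1], [1, 0, 0]],
]
padded_weather = pad_sequences(weather_batch)
```

The same function serves both batches unchanged, because nothing in it depends on the inputs being text. A name such as `num_tokens` would still run, but it would mislead anyone reusing the helper on the second batch.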