[PR #2501] Add hazm #1820

Open
opened 2025-11-06 13:23:51 -06:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/vinta/awesome-python/pull/2501
Author: @ayub-kokabi
Created: 8/1/2023
Status: 🔄 Open

Base: master ← Head: patch-1


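📝 Commits (1)

  • 15dcaf4 (https://github.com/vinta/awesome-python/commit/15dcaf4d29e1dcf3010e3249f6e954aa395e96b6) Add hazm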

📊 Changes

1 file changed (+3 additions, -1 deletions)

View changed files

📝 README.md (+3 -1)

📄 Description

What is this Python project?

Hazm (https://www.roshan-ai.ir/hazm/) is a Python library for natural language processing of Persian text. You can use it to normalize text, tokenize sentences and words, lemmatize words, assign part-of-speech tags, identify dependency relations, create word and sentence embeddings, and read popular Persian corpora.

Features:

  • Normalization: Converts text to a standard form, for example by removing diacritics and correcting spacing.
  • Tokenization: Splits text into sentences and words.
  • Lemmatization: Reduces words to their base forms.
  • POS tagging: Assigns a part of speech to each word.
  • Dependency parsing: Identifies the syntactic relations between words.
  • Embedding: Creates vector representations of words and sentences.
  • Persian corpora reading: Reads popular Persian corpora with ready-made scripts and minimal code.
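
To make the first two features concrete, here is a minimal, standard-library-only sketch of what normalization and tokenization can mean for Persian text. It is not Hazm's implementation and does not call Hazm's API: the function names (`toy_normalize`, `toy_sent_tokenize`, `toy_word_tokenize`) and the specific rules (mapping Arabic yeh/kaf to their Persian forms, dropping combining marks, splitting on sentence-final punctuation) are assumptions chosen only for illustration.

```python
import re
import unicodedata

# Illustrative assumption: two Arabic letters that Persian text
# conventionally writes with different code points.
ARABIC_TO_PERSIAN = str.maketrans({
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})


def toy_normalize(text: str) -> str:
    """Map Arabic yeh/kaf to Persian forms, drop diacritics, tidy spacing."""
    text = text.translate(ARABIC_TO_PERSIAN)
    # Arabic-script diacritics are combining marks (Unicode category Mn).
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    # Collapse runs of spaces and tabs into a single space.
    return re.sub(r"[ \t]+", " ", text).strip()


def toy_sent_tokenize(text: str) -> list[str]:
    """Split after '.', '!', '?' or the Arabic question mark, then whitespace."""
    parts = re.split(r"(?<=[.!?\u061F])\s+", text)
    return [part for part in parts if part]


def toy_word_tokenize(sentence: str) -> list[str]:
    """Return word tokens and single punctuation tokens.

    The zero-width non-joiner (U+200C) is kept inside words, because
    Persian uses it to join the parts of one word.
    """
    return re.findall(r"[\w\u200C]+|[^\w\s]", sentence)


if __name__ == "__main__":
    sample = "\u0643\u062A\u0627\u0628  \u062E\u0648\u0628 \u0627\u0633\u062A. \u0639\u0644\u064A \u0622\u0645\u062F!"
    for sentence in toy_sent_tokenize(toy_normalize(sample)):
        print(toy_word_tokenize(sentence))
```

A real library such as Hazm handles far more cases than this sketch (for instance affix spacing and punctuation conventions), so treat the rules above as a picture of the idea rather than a substitute.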

What's the difference between this Python project and similar ones?

As far as I know, no other library matches the utility this package provides.

Anyone who agrees with this pull request could submit an Approve review to it.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2025-11-06 13:23:51 -06:00
Reference: github-starred/awesome-python#1820