[Pinterest] Downloaded pin image has low resolution #838

Open
opened 2025-11-09 09:59:32 -06:00 by GiteaMirror · 3 comments

Originally created by @iambtshft on GitHub (May 28, 2025).

Brief

When I try to download an image from Pinterest, the returned result sometimes has low resolution. Example: https://www.pinterest.com/pin/70437489341156/

Technical analysis

Here's the relevant parser code: https://github.com/imputnet/cobalt/blob/4b9644ebdfbfe7bc6f7ec2d476692e3619cb59bd/api/src/processing/services/pinterest.js#L34-L38

The service takes the first picture with a supported extension matched by the regex. However, for the example pin the first match is not the best quality; see the output:

[0]  {src="https://i.pinimg.com/236x/7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg"}  
[1]  {src="https://i.pinimg.com/736x/7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg"}  
[2]  {src="https://i.pinimg.com/75x75_RS/9e/

Potential solution

Option 1 - Look up a better resolution

I'm not an expert in how Pinterest structures its data, but judging by the names, it looks possible to take the image identifier from the first image (7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg) and look for an image with the same identifier but a higher resolution ({vvv}x).
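The lookup described in Option 1 could be sketched roughly like this. It assumes (unverified) that all variants of one image share the same path after the size segment and that size segments look like `236x` or `736x`. `pickBestVariant` and `VARIANT` are hypothetical names, not part of cobalt.

```javascript
// Hypothetical helper: not part of cobalt's code base.
// Assumption: URLs look like https://i.pinimg.com/<width>x/<id path>.<ext>
const VARIANT = /^https:\/\/i\.pinimg\.com\/(\d+)x\/([0-9a-f/]+\.\w+)$/;

function pickBestVariant(urls) {
    let mainId = null;
    let best = null;

    for (const url of urls) {
        const match = VARIANT.exec(url);
        if (!match) continue; // skips entries such as 75x75_RS thumbnails

        const width = Number(match[1]);
        const id = match[2];

        // The first matching URL defines which image we are looking for.
        if (mainId === null) mainId = id;
        if (id !== mainId) continue;

        // Keep the widest variant seen so far for that image.
        if (best === null || width > best.width) best = { url, width };
    }

    return best ? best.url : null;
}
```

This only picks among URLs already present in the page, so it would return the 736x variant for the example pin rather than the original.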

Option 2 - Parse images from json

While investigating the page content, I found that besides the images provided as src=<something>, there's JSON-structured pin data. It has much more information, such as the original image URL (which is not present in the src=<> pattern).

<script data-relay-response="true" type="application/json">
      {
      <OMITTED>
                "imageSpec_236x": {
                  "height": 295,
                  "width": 236,
                  "url": "https://i.pinimg.com/236x/7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg"
                },
                "imageSpec_orig": {
                  "url": "https://i.pinimg.com/originals/7c/0a/1c/7c0a1c5f1c999a4a67f3c5b847da093c.jpg"
                },
    <OMITTED>

Again, I'm not sure whether this data is available for every pin, but it looks like a more robust solution, and src parsing could be used as a fallback.
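A minimal sketch of Option 2, working on the raw HTML string. Since the surrounding JSON structure is omitted above, it does not assume a fixed path and instead searches the parsed data for the first `imageSpec_orig` entry. `extractOriginalImage`, `findOrigUrl` and `RELAY_SCRIPT` are hypothetical names, not part of cobalt. One caveat: if a page embeds data for several pins, the first hit may not belong to the main pin.

```javascript
// Hypothetical helpers: not part of cobalt's code base.
const RELAY_SCRIPT = /<script[^>]*data-relay-response="true"[^>]*>([\s\S]*?)<\/script>/g;

// Depth-first search for the first "imageSpec_orig" object with a string url.
function findOrigUrl(node) {
    if (node === null || typeof node !== "object") return null;

    const spec = node.imageSpec_orig;
    if (spec && typeof spec.url === "string") return spec.url;

    for (const child of Object.values(node)) {
        const found = findOrigUrl(child);
        if (found) return found;
    }
    return null;
}

function extractOriginalImage(html) {
    for (const [, body] of html.matchAll(RELAY_SCRIPT)) {
        try {
            const url = findOrigUrl(JSON.parse(body));
            if (url) return url;
        } catch {
            // Not valid JSON; try the next script tag.
        }
    }
    return null; // the caller can fall back to src="..." parsing
}
```

Returning `null` rather than throwing keeps the existing src-based parsing usable as the fallback suggested above.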

reproduction steps

  1. Go to cobalt.tools
  2. Insert https://www.pinterest.com/pin/70437489341156/
  3. Hit download

Actual result: Image has low quality
Expected result: Image has the same quality as on the Pinterest page.

screenshots

-

links

https://www.pinterest.com/pin/70437489341156/

platform information

additional context

GiteaMirror added the bug label 2025-11-09 09:59:32 -06:00

@agvantibo-again commented on GitHub (Jul 10, 2025):

+1, reproduced accidentally with https://pinterest.com/pin/333618284916219545

Downloaded image was 236x236, original image is 736x736


@potatolover68 commented on GitHub (Aug 12, 2025):

After further digging (testing on https://www.pinterest.com/pin/29695678788111907/), it seems that for every image there's a script tag with the id "__PWS_INITIAL_PROPS__" that contains a list of image sizes, including the original.

https://regex101.com/r/IAmYqE/1

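// Run in the browser console on a pin page: the pin ID is the first run of digits in the URL.
const pinId = document.URL.match(/\d+/)[0];
const props = JSON.parse(document.getElementById("__PWS_INITIAL_PROPS__").textContent);
props.initialReduxState.pins[pinId].images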

Note that, as far as I've tested, this only works when you're signed in; otherwise the "pins" object is empty.
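Because the "pins" object can be empty, a lookup outside the browser console would probably want to guard every step. A minimal sketch, assuming the script tag's text has already been extracted into a string; `getPinImages` is a hypothetical name, and taking the ID from the `/pin/<digits>` part of the URL is an assumption based on the example links in this thread.

```javascript
// Hypothetical helper: not part of cobalt's code base.
// `propsJson` is the text content of the __PWS_INITIAL_PROPS__ script tag.
function getPinImages(propsJson, pageUrl) {
    // Assumption: pin URLs contain /pin/<digits>, as in the examples here.
    const pinId = pageUrl.match(/\/pin\/(\d+)/)?.[1];
    if (!pinId) return null;

    let props;
    try {
        props = JSON.parse(propsJson);
    } catch {
        return null; // malformed or missing JSON
    }

    // When signed out, `pins` was observed to be empty, so every step is optional.
    return props?.initialReduxState?.pins?.[pinId]?.images ?? null;
}
```

A `null` result would then signal that another approach is needed for signed-out requests.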


@potatolover68 commented on GitHub (Aug 13, 2025):

After even more digging (testing on https://www.pinterest.com/pin/69665125479478586/), when you're not signed in, you can use the following regex:

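// Dots in the host name are escaped so they only match a literal "."
const p = /https:\/\/i\.pinimg\.com\/(\d{3}x)\/[0-9a-f/]{41}\.jpg/gm;
// Deduplicate the matches while keeping their order
[...new Set(document.body.innerHTML.match(p))];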

to match all the image URLs.

pitfalls
  • This is time-sensitive, so it's best to run it right after the page loads; otherwise, it can't differentiate between the endless-scroll content and the main content.

  • Note that the first image in the list is always the main content; perhaps this could be used to filter the list
