10-10-2022, 01:48 AM | #1 |
Evangelist
Posts: 461
Karma: 82692
Join Date: May 2021
Device: kindle
|
India Today Magazine update
It stopped working; they changed things on their website.
|
10-11-2022, 02:07 AM | #2 |
Evangelist
Posts: 461
Karma: 82692
Join Date: May 2021
Device: kindle
|
Indian Express
remove_tags update: they keep coming up with new tags.
|
10-11-2022, 04:33 AM | #3 |
Evangelist
Posts: 461
Karma: 82692
Join Date: May 2021
Device: kindle
|
economic times print edition
cover url method update
|
10-20-2022, 05:35 AM | #4 |
Evangelist
Posts: 461
Karma: 82692
Join Date: May 2021
Device: kindle
|
eenadu_ap recipe cover_url update
https://github.com/kovidgoyal/calibr...nadu_ap.recipe Code:
def get_cover_url(self):
    from datetime import date
    cover = 'https://img.kiosko.net/' + str(
        date.today().year
    ) + '/' + date.today().strftime('%m') + '/' + date.today(
    ).strftime('%d') + '/in/eenadu.750.jpg'
    br = BasicNewsRecipe.get_browser(self)
    try:
        br.open(cover)
    except:
        index = 'https://es.kiosko.net/in/np/eenadu.html'
        soup = self.index_to_soup(index)
        for image in soup.findAll('img', src=True):
            if image['src'].endswith('750.jpg'):
                return 'https:' + image['src']
        self.log("\nCover unavailable")
        cover = None
    return cover
|
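The dated part of the cover URL in the code above can also be built with a single strftime call. This is only a minimal sketch, assuming the same img.kiosko.net URL layout as the posted code; the helper name kiosko_cover_url is hypothetical and is not part of calibre or the recipe.

```python
from datetime import date


def kiosko_cover_url(day=None):
    """Build the dated cover URL for a given day (defaults to today).

    Hypothetical helper: the URL pattern is copied from the posted
    get_cover_url code; nothing here checks that the image exists.
    """
    day = day or date.today()
    # %Y/%m/%d gives a four-digit year and zero-padded month and day
    return 'https://img.kiosko.net/' + day.strftime('%Y/%m/%d') + '/in/eenadu.750.jpg'


print(kiosko_cover_url(date(2022, 10, 20)))
```

Passing the date in as an argument makes it easy to check the result for a fixed day; the recipe would still need the try/except fallback for days when the image is missing.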
10-22-2022, 03:01 AM | #5 |
Evangelist
Posts: 461
Karma: 82692
Join Date: May 2021
Device: kindle
|
India Today Magazine articles have long dashes between words which show up as —
Adding encoding utf-8 didn't work, so I added: Code:
def preprocess_raw_html(self, raw_html, url):
    return raw_html.replace('—', '--')
Maybe add this code to the recipe! Also this bold part: Code:
extra_css = '''
    #sub-d {font-style:italic; color:#202020;}
    .story__byline {font-size:small; text-align:left;}
    .body_caption, .mos__alt, .caption, .caption-drupal-entity {font-size:small; text-align:center;}
    blockquote{color:#404040;}
'''
Last edited by unkn0wn; 10-22-2022 at 03:37 AM. |
10-22-2022, 05:16 AM | #6 |
creator of calibre
Posts: 43,993
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
If encoding utf-8 didn't work, then the pages likely aren't encoded in utf-8, so you will need the correct encoding. Common ones to try are cp1252 and latin1.
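To see what each of these encodings does to a long dash, the same bytes can be decoded with every candidate and compared. This is a minimal sketch, assuming for illustration that the dash arrives as UTF-8 bytes (not confirmed for the India Today site); the function name decode_candidates is hypothetical.

```python
def decode_candidates(raw, encodings=('utf-8', 'cp1252', 'latin1')):
    """Decode raw bytes with each candidate encoding.

    Returns a dict mapping encoding name to the decoded text, or to
    None when the bytes are not valid in that encoding.
    """
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# An em dash (U+2014) between two words, as UTF-8 bytes
sample = 'one\u2014two'.encode('utf-8')
for name, text in decode_candidates(sample).items():
    # ascii() makes the individual code points visible
    print(name, ascii(text))
```

For this sample, only the utf-8 decoding gives back the single dash character; cp1252 and latin1 each turn its three bytes into three unrelated characters.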
|
10-22-2022, 01:45 PM | #7 |
Evangelist
Posts: 461
Karma: 82692
Join Date: May 2021
Device: kindle
|
I tried encoding = 'cp1252'.
This makes the long dash show up as � and makes a lot of text unreadable. I think the replace solution is much better than figuring out the encoding; the problem is only with the em dash, and they use a lot of them. I also tried latin1; it doesn't work.
Last edited by unkn0wn; 10-22-2022 at 02:04 PM. |
10-22-2022, 11:42 PM | #8 |
creator of calibre
Posts: 43,993
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
ok, fine by me.
|