Old 01-10-2021, 06:37 AM   #61
wrCisco
Enthusiast
Posts: 34
Karma: 467802
Join Date: Apr 2016
Device: none
With the new registered QMetaType and the support in EmbeddedPython.cpp, the direct conversion from QList<int> to QVariant now works great, thanks!

As for the @import rule, the problem now is if I write something plainly wrong, like
Code:
@import { src: url(cssprova.css); }
the parser doesn't raise an error, and the serialization ends up with something like this:
Code:
@import {;

}
Old 01-10-2021, 09:41 AM   #62
KevinH
Sigil Developer
Posts: 7,736
Karma: 5446592
Join Date: Nov 2009
Device: many
Yes, it is not a fully validating parser (yet). Adding checks for specific direct internal state transitions that are not allowed will catch most of those. In other words, flagging a transition from "in import" directly to "in selector" or "in property" would help detect those cases.

But the older CSSInfo did no validation at all, and incorrectly parsed valid css at times!

I will look into it. Thanks!
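
A minimal sketch of the state-transition check described in this post, written in Python for brevity. The state names and the set of forbidden pairs are assumptions made for illustration; they are not taken from the cssparser source.

```python
# Hypothetical parser states; the names are illustrative only.
AT_BLOCK = "at_block"
IN_IMPORT = "in_import"
IN_SELECTOR = "in_selector"
IN_PROPERTY = "in_property"
IN_VALUE = "in_value"

# Direct transitions that should never occur in valid CSS (assumed list).
FORBIDDEN = {
    (IN_IMPORT, IN_SELECTOR),
    (IN_IMPORT, IN_PROPERTY),
}


def check_transitions(states):
    """Return one error string per forbidden pair of consecutive states."""
    errors = []
    for step, (prev, cur) in enumerate(zip(states, states[1:]), start=1):
        if (prev, cur) in FORBIDDEN:
            errors.append(f"step {step}: illegal transition {prev} -> {cur}")
    return errors


if __name__ == "__main__":
    # A trace such as "@import { src: ... }" might produce this sequence.
    print(check_transitions([AT_BLOCK, IN_IMPORT, IN_PROPERTY, IN_VALUE]))
```

A real parser would run this check at the moment it changes state rather than over a recorded trace; the trace form is used here only to keep the example self-contained.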
Old 01-10-2021, 01:31 PM   #63
KevinH
Okay, the very latest version of cssparser2 (not yet posted) will now happily detect errors of using a "{" in @import, @charset, and @namespace rules and report them, while trying to continue gracefully without losing any css.

I have also removed a couple of unused functions from CSSUtils.cpp and .h

I will post a cssparser_v2.1.zip later in the week when I get more free time.

Thanks!

Old 01-11-2021, 11:58 AM   #64
KevinH
qcssparser and final version of cssparser_v2.1

Okay, I finished porting cssparser v2.1 to use Qt.

See the attached: qcssparser.zip

Building it requires Qt. On my machine you would do the following:
export PATH=${PATH}:/Users/kbhend/Qt5129/bin
unzip qcssparser.zip
cd qcssparser
qmake
make

It has a similar main.cpp as cssparser2 to allow you to see the parser output and the serialized result.

This means I have probably finished updating cssparser, as Sigil would need a parser that understands Qt strings and containers with their built-in support for Unicode.

All future work will focus on qcssparser and trying to incorporate it into Sigil itself if we decide to go that way in the end.

For the record, I have attached to this issue cssparser_v2.1.zip and qcssparser.zip for anyone who might want to play around with either.

Last edited by KevinH; 01-12-2021 at 12:46 PM. Reason: Remove now outdated versions - see later posts for updated zip archives
Old 01-11-2021, 01:40 PM   #65
BeckyEbook
Guru
Posts: 704
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
I unpacked 1000 CSS files from EPUB files, ran cssparser on them and it doesn't look bad.
It found quite a few bugs in purchased files, although – luckily – most of it was fine.


I noticed that cssparser ignores the missing last brace:

Code:
.kevin {
    color: blue;
This is a bug for me, http://csslint.net/ shows it nicely.

Spoiler:
For the curious:
There were only 6 files with errors, but as many as 526 had warnings.
No files had the "Information" status.

The error in the files was always the same: duplicate opening brace.
The warnings (over 5,000 in total) concerned, among others:
  • common typing errors (aling, pdding, rightt …)
  • non-existent attributes (border-size, cellspacing …)
  • attributes related to specific browsers or applications (-webkit…, -o…, -moz…, -ms…, -adobe…)
  • attributes with prefix -epub… (-epub-hyphenate, -epub-hyphens, -epub-line-break, -epub-ruby-position, -epub-text-align-last, -epub-text-combine, -epub-text-emphasis-style, -epub-writing-mode)
  • font-related attributes, e.g. OpenType features (font-feature-settings, font-kerning, font-variant-numeric …)
Old 01-11-2021, 01:52 PM   #66
KevinH
CSSProperties.cpp contains the list of all css3 properties this parser groks.

If there are official properties that are currently valid and commonly used in epubs that should be added to that list, please let me know which ones need to be added and I will add them.

As for the missing final } of a selector, I thought it provided a warning for that and injected its own. I will look into that.

Thanks for all your testing and feedback!

Old 01-11-2021, 02:13 PM   #67
KevinH
Okay, I tested cssparser_v2.1 against the following broken css:

Code:
.kevin {
    color: blue;


.new { color: Teal; background-color: #FFFF00; }
.deleted { color: red; background-color: #FFFF00; text-decoration: line-through; }
And it produced 2 errors:

Error: 5: Unexpected character '.'in property name
Error: 5: Unexpected character '{'in property name

and the following output:

Code:
.kevin {
    color:blue;
    newcolor:Teal;
    background-color:#FFFF00;
}
So without a proper closing brace for the selector, it kept trying to build the current selector's list of properties and values, and of course then detected the "{" as being an illegal part of a property name.


If the snippet:
Code:
.kevin {
    color: blue;
appeared at the very end of the css, it seems to not detect it at all.

I will keep track of the nesting level during css parsing and add a test at the end that generates an error if the braces are not properly nested.
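
A minimal sketch of such a nesting-level check, written in Python for brevity. It is an illustration under simplifying assumptions, not the cssparser implementation: it counts every brace it sees and does not skip braces inside comments or quoted strings. The function name and the "Unexpected closing brace" message are made up for the example.

```python
def check_brace_balance(css):
    """Return a list of (line_number, message) tuples for brace errors."""
    errors = []
    depth = 0   # current nesting level
    line = 1    # 1-based line counter for error reporting
    for ch in css:
        if ch == "\n":
            line += 1
        elif ch == "{":
            depth += 1
        elif ch == "}":
            if depth == 0:
                errors.append((line, "Unexpected closing brace"))
            else:
                depth -= 1
    # The end-of-input test: the level must be back to zero.
    if depth != 0:
        errors.append((line, "Unbalanced selector braces in style sheet"))
    return errors


if __name__ == "__main__":
    print(check_brace_balance(".kevin {\n    color: blue;\n"))
```

The reported line is simply the line on which the input ends, since that is where the imbalance becomes certain.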
Old 01-11-2021, 02:26 PM   #68
BeckyEbook
Here is a list of attributes with the prefix "-epub":
http://idpf.org/epub/301/spec/epub-c...haracteristics
http://idpf.org/epub/301/spec/epub-c...l#sec-css-text
and the following paragraphs

In spec 3.2 some of the attributes prefixed with "-epub" were removed:
https://www.w3.org/publishing/epub3/...l#sec-cdoc-css
Old 01-11-2021, 02:34 PM   #69
KevinH
Okay, I now keep track of the selector nesting level, and if parsing reaches the end of the style sheet and the level is not back to zero, this is detected as an error in my version of cssparser_v2.2.

Code:
KevinsiMac:cssparser2 kbhend$ ./release/cssparser/cssparser ~/Desktop/junk.css
Errors: 1
  Error: 5: Unbalanced selector braces in style sheet
Warnings: 0
Information: 0
Pos: 0 Type: SEL_START  Data: .kevin
Pos: 13 Type: PROPERTY  Data: color
Pos: 20 Type: VALUE  Data: blue
.kevin {
    color:blue;
Old 01-12-2021, 12:33 PM   #70
KevinH
qcssparser_v2.2.zip and cssparser_v2.2.zip

Okay, the final version of qcssparser_v2.2.zip is attached. It now tracks line number as well as file position. It also detects the errors that it missed, à la BeckyEbook's bug report. I have also added the -epub-specific css properties that are valid in the epub 3.2 spec to the qCSSProperties list.

I will also attach the final version of cssparser_v2.2.zip here just for the record.

Sigil master now has the Qt versions of these files as well.
Attached Files
File Type: zip qcssparser_v2.2.zip (19.8 KB, 135 views)
File Type: zip cssparser_v2.2.zip (31.2 KB, 132 views)

Last edited by KevinH; 01-12-2021 at 01:16 PM. Reason: Adding second attachment
Old 01-13-2021, 03:56 PM   #71
KevinH
FWIW - I have now modified the old CSSInfo.cpp approach used to identify unused stylesheet classes to walk very tenderly when seeing selectors that use combinators (blank, +, ~, >).

It will basically only look at the first part of the selector (pre combinator) and if that is used at all then assume that its full selector (with combinator) is used.
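
A rough sketch of that heuristic, written in Python for brevity. The function names are hypothetical and this is not the CSSInfo.cpp code. The split is deliberately naive: it would also cut at a "~" or ">" that appears inside an attribute selector or inside a functional pseudo-class such as :not(), which a real implementation would need to handle.

```python
import re

# The combinators named above: +, ~, > and plain whitespace (descendant).
_COMBINATOR = re.compile(r"\s*[+~>]\s*|\s+")


def first_compound(selector):
    """Return the part of a selector that precedes its first combinator."""
    return _COMBINATOR.split(selector.strip(), maxsplit=1)[0]


def selector_is_used(selector, used_selectors):
    """Treat the whole selector as used if its first part is used at all."""
    return first_compound(selector) in used_selectors


if __name__ == "__main__":
    print(first_compound("div.note > p"))
    print(selector_is_used("h1 + p", {"h1"}))
```

The trade-off is that some unused selectors will be reported as used, which is the conservative direction for a tool that deletes styles.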

This will buy us some time to come up with a better solution for Sigil.

Right now the qCSSParser should help, but without a really good Query library we will be out of luck. Testing that library and trying to improve it will be my next project.

Even if we get that working on some limited set of queries, if wrCisco's python3lib approach can work, we may be better off using that as it is much more proven.
Old 01-14-2021, 11:50 AM   #72
KevinH
Need large set of CSS stylesheet test cases

Hi All,

I have started work on the gumbo query code and have fixed a few more things in their selector parser code.

So what I need is a large set of css stylesheets that I can parse to extract the selectors from and pass them to the Query parser to stress test it.
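
For quick experiments, a naive way to pull selectors out of a stylesheet can be sketched in Python as below. This is an assumption-laden illustration, not the approach used by qcssparser: it relies on regular expressions, skips at-rule preludes, and splits selector lists on every comma, so a comma inside :is() or :not() would be split incorrectly.

```python
import re


def extract_selectors(css):
    """Return the individual selectors found before each '{' in css."""
    # Drop /* ... */ comments first so they are not mistaken for selectors.
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)
    selectors = []
    for prelude in re.findall(r"([^{}]+)\{", css):
        prelude = prelude.strip()
        if prelude.startswith("@"):
            continue  # @media, @font-face and similar are not selectors
        selectors.extend(s.strip() for s in prelude.split(",") if s.strip())
    return selectors


if __name__ == "__main__":
    sample = "h1, h2 > em { color: red; } /* note */ p::first-line { margin: 0 }"
    print(extract_selectors(sample))
```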

If anyone is willing to share a large set of actual css stylesheets with me (just stylesheets, not epubs), please send me ("KevinH") a link.

One question for those associated with MR's set of public domain epubs, how fancy are the css stylesheets used in those epubs in general? I am specifically looking for stylesheets that use pseudo elements and pseudo classes or combinators.

Do you think I would be able to generate my own good set of stylesheets by randomly downloading a set of public domain epubs from MR and extracting their style sheets? Or would they all just use standard element and .class selectors?


Thanks
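
Since an epub is a ZIP archive, extracting its style sheets can be sketched with Python's standard zipfile module. The function name and the output naming scheme below are assumptions made for illustration.

```python
import zipfile
from pathlib import Path


def extract_stylesheets(epub_path, out_dir):
    """Copy every .css entry of one epub into out_dir; return the new paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(epub_path) as epub:
        for name in epub.namelist():
            if name.lower().endswith(".css"):
                # Prefix with the epub's file stem so that sheets from
                # different books do not overwrite each other.
                target = out_dir / f"{Path(epub_path).stem}_{Path(name).name}"
                target.write_bytes(epub.read(name))
                written.append(target)
    return written
```

Two sheets with the same file name in different folders of one epub would collide under this naming scheme; a fuller version could encode the folder path into the target name.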
Old 01-14-2021, 12:06 PM   #73
DNSB
Bibliophagist
Posts: 36,784
Karma: 146617620
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by KevinH View Post
One question for those associated with MR's set of public domain epubs, how fancy are the css stylesheets used in those epubs in general? I am specifically looking for stylesheets that use pseudo elements and pseudo classes or combinators.

Do you think I would be able to generate my own good set of stylesheets by randomly downloading a set of public domain epubs from MR and extracting their style sheets? Or would they all just use standard element and .class selectors?


Thanks
Pretty well all of the MR epubs I've downloaded have had simple stylesheets.
Old 01-14-2021, 02:34 PM   #74
PeterT
Grand Sorcerer
Posts: 12,262
Karma: 74007256
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
I wonder if some of the stylesheets used by Standard eBooks https://standardebooks.org/ might be of use.
Old 01-14-2021, 03:10 PM   #75
KevinH
Excellent suggestion! I did not know that site even existed. I grabbed one epub randomly (the advanced Readium version) and it had 3 css sheets that involved a number of complex selectors.

This will be a great source for tests.

Thank you!

