05-06-2024, 06:20 AM   #984
chaley
Grand Sorcerer
 
Posts: 11,785
Karma: 7029971
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Quote:
Originally Posted by BrandonGiesing
I currently have the Kobo driver set up so that anything in the default "tags" column becomes a collection. This works fine for the most part, but my Kobo takes about 2 minutes to load the Collections page because I have 5,000+ tags (I know, I need to go through and fix things).

I was wondering if there is any way to use a template to read the item count of each tag, so that a collection is created only if a tag has been used at least ~5 times and tags below that are ignored. That should eliminate a good chunk of the 5,000+ and keep the most important ones.
Yes, this can be done. But to pontificate a bit before going there, I think this is a waste of time. What is "important"? Given spelling variations such as "Science Fiction", "SF", "SciFi", "Sci-Fi", "Science Fiction and Fantasy", and the like, there is no reason to believe that the limited set of tags is useful information. For example, Gibson's work can be tagged "Science Fiction - Cyberpunk", "Cyberpunk", "SF Cyberpunk", etc., and those tags could easily disappear. What I would do first is use "Manage tags" to fix the spelling variations, which can be done fairly quickly. Better still would be to build a hierarchical tag system that makes sense to you, then build collections from the first N levels, where N is whatever makes sense to you. The second is what I actually do.
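If you do go the hierarchical route, a Python template along the same lines can emit just the first N levels of each tag. This is only a sketch, assuming period-separated hierarchy names and N = 2, with the results joined the same way as in the template further down:
Code:
python:
def evaluate(book, context):
	# Sketch: keep only the first 2 levels of each period-separated
	# hierarchical tag, e.g. 'Fiction.SF.Cyberpunk' -> 'Fiction.SF'
	levels = 2
	names = set()
	for name in (book.get('tags') or ()):
		names.add('.'.join(name.split('.')[:levels]))
	return ':@:'.join(sorted(names))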

Pontificating aside, one can do what you want with a template-language template, but with 5,000 tags it would be excruciatingly slow. The Python template below doesn't have that problem.
Code:
python:
def evaluate(book, context):
	field = 'tags'

	# Get the previously computed set of acceptable items, if it exists
	all_names_over_count = context.globals.get('all_names_over_count')
	if all_names_over_count is None:
		db = context.db.new_api
		all_names_over_count = set()
		counts_by_item = db.get_usage_count_by_id(field)
		for item, count in counts_by_item.items():
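			# keep only items used more than 5 times; adjust the threshold to taste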
			if count > 5:
				all_names_over_count.add(db.get_item_name(field, item))
		context.globals['all_names_over_count'] = all_names_over_count

	# Check if the current book has any of the acceptable items
	item_names_in_book = []
	for name in book.get(field):
		if name in all_names_over_count:
			item_names_in_book.append(name)
	return ':@:'.join(item_names_in_book)
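To adapt it, change the value assigned to field to the lookup name of whichever column you build collections from (a custom column such as #genre, for example) and adjust the count threshold. Because the computed set of acceptable names is cached in context.globals, the usage counting happens only once per run instead of once per book, which is why this stays fast even with 5,000+ tags.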