somegrue
Picking up where you left off, though, the problem seems to be specifically that underscores aren't recognized as whitespace: "kylie_minogue" does find the thread in question, along with a couple of others. Likewise, "tension" fails, but "tension_" works.
Next quirk: do dots work as whitespace? Looks like the answer is... complicated, of course.
Taking "New.Jack.City.1991.iNTERNAL.DVDRip.x264-REGRET", currently the most recent request in scnlog.me/forum/node/movies.7, as a test case, it looks to me like the engine treats this as four "atoms", namely "New.Jack.City", "1991", "iNTERNAL.DVDRip.x264", and "REGRET". Searching for any one of those, or any combination of them, succeeds:
"New.Jack.City", "1991", "iNTERNAL.DVDRip.x264", "REGRET"
"New.Jack.City iNTERNAL.DVDRip.x264", "REGRET 1991"
If dots were treated as whitespace, the first and third shouldn't be atomic, but searching for their parts fails:
"New", "Jack", "City"
"iNTERNAL.DVDRip", "DVDRip.x264"
So it's only the dots before and after the year that work as expected. Why? I'm thinking it's less about dots as such and more about boundaries between alphabetic and numeric characters, in some fashion. That's a bit too vague to be of much use, but I'm afraid that for now, I'm just going to leave it at that!
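To make the guess a bit more concrete, here is a rough Python sketch of a splitting rule that reproduces the four atoms seen above. It is purely a hypothesis fitted to these few searches, not the forum engine's actual tokenizer: the function name `guess_atoms` and the rule itself are my own assumptions. The rule is that a dot separates atoms only when it sits directly between a letter and a digit, a hyphen always separates, and an underscore never does. Note that a bare letter-to-digit transition with no dot, as in "x264", does not split.

```python
import re

# Hypothetical rule, inferred only from the searches described in this post:
#   - whitespace and "-" always separate atoms
#   - "." separates atoms only when it sits between a letter and a digit
#   - "_" never separates atoms
_SPLIT = re.compile(
    r"[\s-]+"                       # whitespace or hyphen
    r"|(?<=[A-Za-z])\.(?=\d)"       # dot between letter and digit
    r"|(?<=\d)\.(?=[A-Za-z])"       # dot between digit and letter
)

def guess_atoms(title):
    """Split a title into the atoms the search engine appears to index."""
    return [atom for atom in _SPLIT.split(title) if atom]

print(guess_atoms("New.Jack.City.1991.iNTERNAL.DVDRip.x264-REGRET"))
print(guess_atoms("kylie_minogue"))
```

If the guess holds, a search term should only hit when it matches one of these atoms, which would explain why "Jack" or "DVDRip.x264" on their own come up empty. It doesn't obviously explain why "tension_" works, though, so there is probably more going on.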