"Data model" page is too long #126053
Comments
Try not to break all the external links going into the pages. We don't want to invalidate all the references from blogs, tweets, stackoverflow answers, etc. With regard to search engine results, I don't think we can or should engage in SEO. There is no promise that rearrangements will lead to being a top hit for a search. |
Presumably this is impossible, right @picnixz? |
My suggestion for refactoring these large pages while mitigating the damage to existing deep links was to:
The damage to existing deep links that can't (or won't) be changed is still a good reason to tread carefully, but never being able to split pages as they grow over time isn't a great situation either. For more background on why we should preserve link integrity as much as we can, the World Wide Web Consortium has a decent page here on why "Cool URIs Don't Change": https://www.w3.org/Provider/Style/URI |
Mmmh. It could be possible actually but this would require a custom Sphinx extension and custom redirection
Alyssa's suggestion on having a page serving as a hub is possible but it will be a bit ugly (because we still need to make all possible anchors available on that page so that users can re-click on them to have the expanded content). |
Could the Sphinx extension glue together several pages to form
|
The original Py2-as-default -> Py3-as-default in https://peps.python.org/pep-0430/ was certainly all server-side redirect config. And yeah, I agree the orphaned navigation page isn't a good solution, it's just a better option than leaving people with either a 404 or an unanchored link to the start of a page with less inline content. Unfortunately, web server rewrite rules can't help us here, as the anchor tag part is never sent to the server - it's handled by the browser after downloading the page. HTTP redirects don't help either, as they also operate at the page level. It should be possible to do something clever with client side JavaScript: https://stackoverflow.com/questions/1305211/javascript-to-redirect-from-anchor-to-a-separate-page (and that could potentially be extended further to handle smaller cases like the deep links I recently broke by moving the |
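To illustrate why server-side rewrite rules and HTTP redirects cannot see the anchor, here is a minimal sketch using only the standard library. The helper name `request_target` is hypothetical, and the URL is just an example deep link; the point is that the fragment is split off by the client and never appears in the request path.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query) that ends up in the HTTP request line.

    The fragment is deliberately ignored: browsers keep it to themselves.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


# Example deep link, used purely for illustration.
url = "https://docs.python.org/3/reference/datamodel.html#object.__hash__"

print("kept by the browser:", urlsplit(url).fragment)
print("seen by the server: ", request_target(url))
```

Because the server only ever sees the page path, any anchor-aware redirect has to run in the browser after the page has loaded.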
If you're worried about the length of
Ah yes, I'd forgotten about redirection using JS. I was confused because I was actually thinking of server-side rendering. Now, using JS can be integrated into Sphinx directly (IIRC). |
We've no way of knowing which of the 18k words (or 25k in #126052) is the important text that any given visitor is interested in. That's why more granular pages will help. |
(We may want to break out a separate prerequisite issue for this, but continuing here for now.) Summarising what a potential solution would need in order to allow moving link targets between pages, or making other changes (like updating section headings), without breaking deep links to those anchors:
This is still @picnixz's "custom Sphinx extension" idea, just with a better idea of what that extension would need to offer to enable docs refactoring without worrying about breaking existing deep links. If this existed, my orphaned navigation hub idea wouldn't be needed. |
I like the idea of using the intersphinx data. Here's a script that uses sphobjinv to print links that have died in the 3.14 docs:

from sphobjinv.inventory import Inventory

def load(url):
    # Download an objects.inv file and return the set of URIs it lists.
    inv = Inventory(url=url)
    return {obj.uri_expanded for obj in inv.objects}

old_urls = load('https://docs.python.org/3.13/objects.inv')
new_urls = load('https://docs.python.org/3.14/objects.inv')
# URIs present in 3.13 but missing from 3.14 are dead links.
dead_urls = old_urls - new_urls
for url in sorted(dead_urls):
    print(url)

Current output
|
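As a possible follow-up to the inventory script above, the same sets of URIs could be used to guess where a dead link went, by looking for the same anchor on a different page. This is only a sketch under that assumption: the function name `find_moved_anchors` and the sample inventory entries (including `reference/specialnames.html`) are made up for illustration and are not real documentation pages.

```python
from collections import defaultdict
from urllib.parse import urldefrag


def find_moved_anchors(old_uris, new_uris):
    """Map each dead old URI to new URIs that carry the same anchor."""
    new_uris = set(new_uris)

    # Index the new inventory by anchor so lookups are cheap.
    by_anchor = defaultdict(set)
    for uri in new_uris:
        _page, anchor = urldefrag(uri)
        if anchor:
            by_anchor[anchor].add(uri)

    moved = {}
    for uri in set(old_uris) - new_uris:
        _page, anchor = urldefrag(uri)
        candidates = sorted(by_anchor.get(anchor, ())) if anchor else []
        if candidates:
            moved[uri] = candidates
    return moved


# Hypothetical inventory entries, not taken from the real objects.inv files.
old = {
    "reference/datamodel.html#object.__hash__",
    "library/stdtypes.html#str.join",
}
new = {
    "reference/specialnames.html#object.__hash__",
    "library/stdtypes.html#str.join",
}

for dead, targets in find_moved_anchors(old, new).items():
    print(dead, "->", targets)
```

A mapping like this could feed a redirect table, though anchors that were renamed rather than moved would still need to be recorded by hand.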
A very basic solution might be to redirect users to |
@nedbat Does the docs WG want to take a position with regard to docs stability versus refactoring into smaller chunks in hopes that SEO will be improved? |
I think this should be motivated not just by SEO, but also by improving the usability of the docs. It's a very large file that covers a lot of ground, and the way it's organized isn't necessarily the best. That may be bad for SEO, but it's also not ideal for human readers. Currently the file has not just a discussion of Python's general "data model", the way data is represented, but also detailed documentation about some precise types, such as code objects. That documentation might fit better at https://docs.python.org/3/library/types.html#types.CodeType, so the data model page can focus more on behavior of the core language. Similarly, the data model page has discussion of numbers.Number and similar classes, which feels a bit out of place, as those are library ABCs, not core parts of the language. On the other hand, I also agree we should avoid breaking links. If we want to be very strict in this, we could build some tooling that records e.g. all anchor targets in an old version of the docs and asserts they continue to work. |
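The tooling idea in the comment above (record the anchor targets of an old build and assert they still exist) might look roughly like the following. This is a sketch, not an existing tool: the names `AnchorCollector`, `collect_anchors`, and `missing_anchors` are hypothetical, the HTML snippets are invented, and it only compares a single page, whereas real tooling would also have to follow anchors that moved to other pages.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect every id attribute, i.e. every possible #fragment target."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.anchors.add(value)


def collect_anchors(html: str) -> set[str]:
    collector = AnchorCollector()
    collector.feed(html)
    return collector.anchors


def missing_anchors(old_html: str, new_html: str) -> set[str]:
    """Return anchors present in the old page but absent from the new one."""
    return collect_anchors(old_html) - collect_anchors(new_html)


# Invented page fragments, for illustration only.
old_page = '<section id="special-method-names"><dt id="object.__hash__"></dt></section>'
new_page = '<section id="special-method-names"></section>'

print(sorted(missing_anchors(old_page, new_page)))
```

Run as a docs build check, a non-empty result would flag deep links that a refactoring is about to break.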
A few general considerations on splitting up a long page in the Language Reference (which this is). I'm speaking from my perspective and not for the entire @python/editorial-board. I would urge us to be more conservative with the Language Reference docs than the Library docs since it is the definition of the Python language.
|
@JelleZijlstra's example is in line with my thinking when it comes to Language Reference changes vs. Library Doc changes. |
To be clear, UX and discoverability are the entire reason I care about SEO here! |
I understand your intent. To restate, if improvements to SEO negatively impact UX and discoverability, we should pass until the negatives are mitigated. As an aside, the exclamation point wasn't necessary in the earlier response. |
Sorry! |
I think the page is too long, and splitting it up would improve both UX and SEO. It sounds like there is probably a way to reasonably preserve old links, though that still needs some investigation. It's a big job that should be done with care. |
As there seems to be consensus that a technical improvement around preserving deep links is needed before we embark on any major layout changes, I filed that request as a |
Another good first step is making a concrete proposal about how the page would be split up. I know from my own work on the devguide that it's easy to look through an existing document and be certain that it could be reshaped into something better. When you actually sit down to do the reshaping, difficulties arise, decisions have to be made, and so on. Does someone want to write a doc somewhere that shows how a split page would be structured? |
My first impression is: split them by classes first. They are good on their own IMO. And each class can be regrouped by topic (e.g. strings, numerics, collections, etc.). I can sketch a rough idea if you want (maybe by the end of the afternoon). |
I'm not sure about the Data Model page, but @nedbat's question prompted me to add a draft split for the builtin types page in #126052 (comment) (giving |
Perhaps the most conservative first iteration after getting the linking resolved would be to split the doc where there are natural breaks: 3.1, 3.2, 3.3 and 3.4. This will keep familiarity initially, and it does not preclude us from further splitting classes and 3.2 in future iterations. |
Documentation
The Data model document is very long, and as a result it basically never shows up in search engine results, because 90% of the page is considered irrelevant for any query like "python __hash__".
I suggest we split it up by top-level topic, e.g. we add a dedicated page for "Special method names".
See also #126052