Summarizing the clarifications above as an answer:
There are multiple ways to generate tangent spaces for a mesh, and not all of them agree on the result.
This is a common source of rendering errors in games: the normal-map baking tool generates the texture with respect to one tangent space, while the 3D modelling software or the game's mesh importer decides on a different one, leading to mismatches and artifacts, as shown in these examples from the Handplane documentation:

So, we need to pick a standard tangent space to use.
A popular choice is "MikkTSpace", a method for generating tangent spaces that Morten S. Mikkelsen developed as part of his master's thesis. He specifically designed the algorithm to be robust for use by multiple tools in an asset pipeline, so they can independently generate the same tangent basis regardless of quirks like vertex welding choices or the order of the vertices & faces.
Code for the MikkTSpace algorithm is freely available online, and I'm not an expert in all of its workings, so I won't describe a complete implementation here. Instead I'll address the specific questions about it raised above.
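For context, the core per-triangle step shared by most tangent generators (MikkTSpace included) solves for the texture-space partial derivatives from a triangle's edge vectors and UV deltas. Here's a minimal NumPy sketch of that step; the names are mine, and this is an illustration of the idea rather than the reference code:

```python
import numpy as np

def triangle_tangent(p0, p1, p2, uv0, uv1, uv2):
    """Solve the two edge equations e1 = du1*T + dv1*B and
    e2 = du2*T + dv2*B for T (= dP/du) and B (= dP/dv)."""
    e1, e2 = p1 - p0, p2 - p0
    du1, dv1 = uv1 - uv0
    du2, dv2 = uv2 - uv0
    det = du1 * dv2 - du2 * dv1   # sign of det(T): negative => UVs are mirrored
    r = 1.0 / det                 # assumes a non-degenerate UV mapping
    tangent   = (e1 * dv2 - e2 * dv1) * r
    bitangent = (e2 * du1 - e1 * du2) * r
    return tangent, bitangent
```

The full algorithm then averages and orthonormalizes these per-triangle results across grouped faces, which is where the subtleties discussed below come in.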
**Why do these algorithms disregard smoothing groups?**
"Smoothing groups" don't really exist on GPUs or in most game engines - they're a concept used in 3D modelling tools to make manipulating normals more intuitive.
By the time a mesh is pushed down the graphics pipeline, it's just a raw stream of vertices combining position, texture coordinates, normals, etc.
Wherever two entries in this stream coincide at the same position but have different normals, you'll get a hard crease or lighting seam along any edges they share (or a point discontinuity, like the tip of a cone, if they don't share any edges).
Wherever you have an edge where all vertices at the start of the edge agree on their normal, and all vertices at the end of the edge agree on their (possibly different) normal, you'll get a smooth join with no discontinuity.
Smoothing groups exist to tell the 3D package where it should force vertices along a shared edge to share a normal, versus where the normal can be independently chosen for each vertex.
By the time the mesh gets to the tangent space baking step, these smoothing groups have typically already been converted to vertex splits, so the tangent space algorithm doesn't need to be aware of a particular tool's smoothing conventions - it can just work with the literal vertex data.
No artist-authored smoothing information is lost this way; it has just already been translated into the lower-level form these algorithms understand.
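To illustrate what that translation amounts to, here's a hypothetical Python sketch (not any particular engine's importer): vertices are shared only when both position and normal agree, so a smoothing-group boundary becomes duplicated vertices along the hard edge:

```python
def split_vertices(faces):
    """faces: list of triangles, each a list of (position, normal) tuples.
    Builds a vertex buffer and index buffer where two corners share a
    vertex only if BOTH position and normal match, so hard edges become
    vertex splits."""
    verts, index_of, indices = [], {}, []
    for tri in faces:
        for pos, nrm in tri:
            key = (pos, nrm)
            if key not in index_of:
                index_of[key] = len(verts)
                verts.append(key)
            indices.append(index_of[key])
    return verts, indices
```

Two triangles sharing an edge with matching normals produce four vertices; give each triangle its own face normal and the shared edge splits into six.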
**Why does the MikkTSpace algorithm group only faces that share two vertices, two texture coordinates, AND two normals?**
"Suppose you are approximating a sphere by something like an
icosahedron. In this case there will be no vertex normals shared by
any two vertices, so each individual triangle is going to have its own
tangent space. This seems counterintuitive."
This looks like a misunderstanding of the algorithm, as though it said:
"Two triangles can be grouped only if the vertices at both ends of the shared edge agree on a single shared vertex normal."
But it actually says:
- two vertices (i.e. vertex positions) are shared.
- vertex normals at the two vertices are shared.
- texture coordinates at the two vertices are shared.
- the triangles must have the same sign of det(T) (i.e. either neither, or both, are mirrored in texture space, not one of each).
Point 2 means that:
- triangle A and triangle B's vertices at the START of the shared edge must share one vertex normal
- triangle A and triangle B's vertices at the END of the shared edge must share a (possibly distinct) second vertex normal
...for a total of two shared normals along the common edge.
So you can see this is the same condition we described above with regard to smoothing groups. If this condition is not met - the vertices at at least one end of the edge disagree about their normal direction - then we'll have a sharp crease along this edge. Since the normal experiences a sharp discontinuity along this edge, the tangent basis (which includes the normal) will also be discontinuous along this edge.
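To make the condition concrete, here's a hypothetical sketch of the grouping test in Python (illustrative only, not the actual MikkTSpace source): two triangles can be grouped when exactly two corners match in position, UV, and normal, and the triangles have the same UV orientation:

```python
def uv_det_sign(uv0, uv1, uv2):
    """Sign of det(T): negative means the triangle is mirrored in texture space."""
    du1, dv1 = uv1[0] - uv0[0], uv1[1] - uv0[1]
    du2, dv2 = uv2[0] - uv0[0], uv2[1] - uv0[1]
    return 1 if du1 * dv2 - du2 * dv1 >= 0 else -1

def can_group(tri_a, tri_b):
    """tri_*: three corners, each a (position, uv, normal) tuple.
    Corners are 'shared' only if ALL THREE attributes match exactly."""
    shared = [c for c in tri_a if c in tri_b]
    if len(shared) != 2:          # need both ends of a common edge to agree
        return False
    # same sign of det(T): neither or both mirrored, not one of each
    return uv_det_sign(*(c[1] for c in tri_a)) == uv_det_sign(*(c[1] for c in tri_b))
```

Note how a disagreement in the normal at just one end of the edge, or a mirrored UV mapping on one side, is enough to put the triangles in separate groups.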
**Why does the algorithm accumulate magnitudes of vectors?**
Checking through the code, you'll find the magnitudes of the partial derivatives are not used for the "basic" version of the tangent space.
For the more advanced version, here is Mikkelsen's original comment:
// This function is used to return tangent space results to the application.
// fvTangent and fvBiTangent are unit length vectors and fMagS and fMagT are their
// true magnitudes which can be used for relief mapping effects.
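In other words, the returned tangent and bitangent are unit length, and the lengths the partial derivatives had before normalization are reported separately. A sketch of that split (illustrative Python, not Mikkelsen's actual code):

```python
import numpy as np

def tangent_with_magnitude(dP_du, dP_dv):
    """Split the raw texture-space partial derivatives into unit
    directions plus their true magnitudes, mirroring the
    fvTangent/fMagS and fvBiTangent/fMagT split described above."""
    mag_s = np.linalg.norm(dP_du)
    mag_t = np.linalg.norm(dP_dv)
    return dP_du / mag_s, mag_s, dP_dv / mag_t, mag_t
```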
Relief Mapping is an effect where we approximate parallax and self-occlusion of surface structure by ray-marching through a height field texture from our initial surface sample point. We imagine the surface has some depth to it, and that our view ray can continue some distance from where it hit the bounding volume of the polygon geometry, before it actually hits the displaced surface underneath.
To make it work, we need to transform our view vector from eye/world/object space into texture space:

(Diagram of Relief Mapping from GPU Gems)
To do that precisely, we need to know more than just how the texture is oriented with regard to the 3D geometry (which is what we get from the tangent and normal directions); we also need to know how it's scaled. If we ignore this, a ray entering a compressed part of the texture will behave like it's refracted in water, covering less world-space distance parallel to the surface for each unit of travel perpendicular to it, distorting the effect. We can use the magnitude information provided by the algorithm to compensate, ensuring our texture-space ray matches the direction of our view ray in the world.
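As an illustrative sketch (the conventions and names here are my assumptions, not taken from any specific relief-mapping implementation), the transform plus scale compensation might look like:

```python
import numpy as np

def view_to_texture_space(view_ws, tangent, bitangent, normal, mag_s, mag_t):
    """Project a world-space view vector onto the (unit) tangent frame,
    then divide the u/v components by the texture's world-space scale
    (mag_s, mag_t) so one unit of travel in texture space corresponds
    to the correct world-space distance."""
    v = np.array([np.dot(view_ws, tangent),
                  np.dot(view_ws, bitangent),
                  np.dot(view_ws, normal)])
    v[0] /= mag_s   # compensate for texture stretching/compression along u
    v[1] /= mag_t   # ...and along v
    return v
```

Where the texture is compressed (small mag_s), the ray covers more texels per unit of world-space travel, which is exactly the "refraction" distortion the division corrects for.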
There are other effects that might benefit from this kind of scale information about the texture mapping (perhaps some forms of tessellation using a control texture?), but if you're just using the tangent space for standard normal mapping and only care about directions, you can safely ignore the magnitude tracking the MikkTSpace algorithm does and use the basic version instead.