Instead of defining the two people independently of each other, define the first person's position by her absolute location and the second person's position by his location relative to hers, like the difference between world and local coordinates. His relative position is described in a coordinate system whose origin is the first person. For example, her absolute position could be $(2,4)$ and his relative position could be $(3,3)$, which means he is 3 units to the right of her and 3 units above her (so his absolute position is $(5,7)$). Then you need to adjust his random walk to the relative coordinates. For instance, if both people can go one space in 4 directions and both move one space up, he does not move in his relative coordinate system. If she moves up and he moves down in the absolute coordinate system, he moves down two spaces in the relative coordinate system. If he moves up and she moves right, he moves up one space and left one space in his coordinate system. You can calculate the probabilities for these movements and simulate them as his own random walk, separate from hers. You stop his walk when he reaches $(0,0)$, which is when they meet, and just continue to simulate her random walk.
In the 2D case, suppose they can move one space in any of the four directions on a graph, they cannot stay in the same location, they move at the same time and speed, and their movements are discrete, meaning they teleport from, say, $(1,1)$ to $(1,2)$. They do not meet if they cross the same path but end up at different nodes; for instance, they would not meet if she moved from $(1,1)$ to $(1,2)$ and he moved from $(1,2)$ to $(1,1)$. Then her random walk would be
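As a quick check of those three cases, here is a minimal sketch in Python. The names `relative_step`, `UP`, `DOWN`, `LEFT`, and `RIGHT` are just ones I made up for illustration; the only idea is that his step in the relative system is his absolute move minus her absolute move.

```python
# Unit moves in absolute (world) coordinates.
UP, DOWN, LEFT, RIGHT = (0, 1), (0, -1), (-1, 0), (1, 0)

def relative_step(his_move, her_move):
    """Change in his position as seen from her: his move minus her move."""
    return (his_move[0] - her_move[0], his_move[1] - her_move[1])

print(relative_step(UP, UP))     # both move up: (0, 0)
print(relative_step(DOWN, UP))   # he moves down, she moves up: (0, -2)
print(relative_step(UP, RIGHT))  # he moves up, she moves right: (-1, 1)
```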
$$
\begin{matrix}
(x,y+1) & \frac14 \\
(x,y-1) & \frac14 \\
(x+1,y) & \frac14 \\
(x-1,y) & \frac14 \\
\end{matrix}
$$
where the first column is the new position after one step from $(x,y)$ and the second column is the probability of that step. For the man's relative coordinates:
$$
\begin{matrix}
(x,y+2) & \frac1{16} \\
(x,y-2) & \frac1{16} \\
(x+2,y) & \frac1{16} \\
(x-2,y) & \frac1{16} \\
(x+1,y+1) & \frac18 \\
(x+1,y-1) & \frac18 \\
(x-1,y-1) & \frac18 \\
(x-1,y+1) & \frac18 \\
(x,y) & \frac14
\end{matrix}
$$
The first four happen when the two people move in opposite directions. The next four happen when they move in perpendicular directions. The last one happens when they move in the same direction.
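If you want to double-check these probabilities, you can enumerate all 16 equally likely pairs of moves. This is only a sketch under the assumptions stated above (four directions, each with probability $\frac14$, both people moving independently); `relative_step_distribution` is a name I picked for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four allowed absolute moves, each taken with probability 1/4.
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def relative_step_distribution(moves=MOVES):
    """Distribution of (his move - her move) over all equally likely pairs."""
    dist = Counter()
    p = Fraction(1, len(moves) ** 2)  # each pair of moves has probability 1/16
    for his, hers in product(moves, repeat=2):
        dist[(his[0] - hers[0], his[1] - hers[1])] += p
    return dict(dist)

for step, prob in sorted(relative_step_distribution().items()):
    print(step, prob)
```

Using `Fraction` keeps the results exact, so they can be compared directly with the table.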
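Once again, you simulate her walk and his relative walk separately, and you stop his walk when he reaches $(0,0)$. His relative step is actually correlated with her step, so this is exact for when they meet but not for where; if only the meeting time matters, his relative walk is all you need.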
You can use this general method to fit the exact details of your specific random walk.
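Here is a rough sketch of simulating his relative walk until they meet. The function name `steps_until_meeting` and the `max_steps` cap are my own additions; the cap is there because a single run can take a very long time. One thing worth noting: every relative step changes $x+y$ by an even amount, so if his starting relative position has odd $x+y$, he can never reach $(0,0)$ under these rules.

```python
import random

# The four allowed absolute moves, each taken with probability 1/4.
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def steps_until_meeting(start, max_steps=100_000, rng=random):
    """Number of steps until his relative position is (0, 0), or None if
    that does not happen within max_steps."""
    x, y = start
    for step in range(max_steps + 1):
        if (x, y) == (0, 0):
            return step
        # Draw both absolute moves, then apply the difference to his
        # relative position.
        his, hers = rng.choice(MOVES), rng.choice(MOVES)
        x += his[0] - hers[0]
        y += his[1] - hers[1]
    return None

rng = random.Random(0)  # fixed seed only so that runs are repeatable
runs = [steps_until_meeting((3, 3), max_steps=10_000, rng=rng) for _ in range(20)]
print(runs)
```

Entries that come back as `None` are runs that did not meet within the cap, so any average you take over the rest is biased low.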
Answering Questions
I generally do not use papers from academic journals, so I don't really have any sources. However, I think you could probably find what you're looking for if you search for terms like "local coordinate system," "coordinate transforms," etc. My thought process came from both the two-body Schrödinger equation and the game engine I'm working on, so unless you're planning to go into quantum physics or video game development, I have nothing for you. The only other thing I had in mind when coming up with this solution was that the coordinates at which they meet did not matter.
This kind of coordinate change shows up in a lot of places, like in solving the Two-Body Schrödinger Equation, in video games with local and world/global coordinates, etc.
In video games, you generally have an object with something like a center of mass that moves around in the global/world coordinate system. The object also has a body that consists of points whose positions you define in local coordinates, and you transform the local coordinates of the body into world coordinates. For instance (and this is mainly how I use it), let's say I have an asteroid moving at a constant velocity and rotating. I would move the center of mass using $d=rt$. (I'm making some simplifying assumptions. Normally, I would use something like RK4, but it boils down to $d=rt$ in this case.) Then, I would rotate the body of the asteroid using angular momentum and the inverse inertia tensor for that body. Finally, to calculate where the points are in world space, I add the rotated local coordinates of the body to the center of mass. This allows me to separate the linear components of motion from the rotational components.
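To make the local-to-world step concrete, here is a simplified 2D sketch. It is not code from my engine: the function name `local_to_world`, the square "asteroid", and the numbers are made up, and I just use a fixed rotation angle instead of angular momentum and an inertia tensor.

```python
import math

def local_to_world(local_points, center, angle):
    """Rotate each local point by `angle` (radians), then add the center of mass."""
    c, s = math.cos(angle), math.sin(angle)
    cx, cy = center
    return [(cx + c * x - s * y, cy + s * x + c * y) for x, y in local_points]

# A square body defined in local coordinates around its center of mass.
body = [(1, 1), (-1, 1), (-1, -1), (1, -1)]

# Linear part: d = r t with a constant velocity.
start, velocity, t = (2.0, 4.0), (0.5, 0.0), 3.0
center = (start[0] + velocity[0] * t, start[1] + velocity[1] * t)

# Rotational part: a constant spin rate, again just for illustration.
angle = 0.1 * t

print(local_to_world(body, center, angle))
```

The linear motion only touches `center` and the rotation only touches `angle`, which is the separation I was talking about.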
Here is a related question that goes more into only being concerned with when they meet in a one-dimensional random walk.
An interesting thing to note is that I had this exact problem come up in a completely random idea I had for part of a video game. I put four people on a finite graph that had four possible pathways for each of its 30 nodes and would wrap around, so that going up from the top node would put you in a bottom node (like Pac-Man or Asteroids), and they could only leave if they all ended up in the same room. I remember that I did it differently (I also used Microsoft Excel to simulate the entire thing): I assumed that they had a $20\%$ chance of going through any of the four pathways or staying exactly where they were. I made a huge $30\times30$ transition matrix for the Markov chain, put each of the four players in a different room (represented by a $1$ in a column vector of zeros), and iterated the Markov chain until they had an almost equal chance of being in any room. I then did a bunch of math with the data I had to find some expected values and other useful info. For fewer people, or when people met up, I would start the simulation over again with the number of distinct groups, so if two people found each other but the other two people were still alone, I would simulate three groups total. I had completely forgotten about it until now. I did not use the local vs. world coordinates because I had just found out about Markov chains, so I tried forcing them into everything I could. I'm not too sure about this method's accuracy, though. I would have to go through the math again.
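For reference, the transition-matrix part of that idea can be sketched as below. This is a reconstruction in Python rather than my Excel sheet, and the $5\times6$ wrap-around grid is an assumption, since all I remember is 30 nodes with four pathways each. It only tracks one player's distribution over rooms, not the meeting logic.

```python
ROWS, COLS = 5, 6  # assumed layout: 5 x 6 = 30 rooms on a wrap-around grid

def index(r, c):
    """Room number for row r and column c, wrapping around the edges."""
    return (r % ROWS) * COLS + (c % COLS)

def build_transition_matrix(p=0.2):
    """30 x 30 matrix: probability p each of staying or using one of four pathways."""
    n = ROWS * COLS
    P = [[0.0] * n for _ in range(n)]
    for r in range(ROWS):
        for c in range(COLS):
            i = index(r, c)
            for dr, dc in [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]:
                P[i][index(r + dr, c + dc)] += p
    return P

def step(dist, P):
    """One iteration of the Markov chain applied to a distribution over rooms."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

P = build_transition_matrix()
dist = [0.0] * (ROWS * COLS)
dist[0] = 1.0  # the player starts in room 0
for _ in range(200):
    dist = step(dist, P)

# Spread between the most and least likely rooms after 200 iterations.
print(max(dist) - min(dist))
```

The printed spread shows how close the distribution is to an equal chance of being in any room.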