[Lldb-commits] [PATCH] D131085: [lldb/crashlog] Refactor the CrashLogParser logic

Tue Aug 9 13:31:43 PDT 2022

mib added inline comments.

================
Comment at: lldb/examples/python/crashlog.py:434
         except CrashLogFormatException:
-            return TextCrashLogParser(debugger, path, verbose).parse()
+            return  object().__new__(TextCrashLogParser)

----------------
kastiglione wrote:
> kastiglione wrote:
> > mib wrote:
> > > JDevlieghere wrote:
> > > > kastiglione wrote:
> > > > > mib wrote:
> > > > > > JDevlieghere wrote:
> > > > > > > mib wrote:
> > > > > > > > kastiglione wrote:
> > > > > > > > > I have not seen the `object().__new__(SomeClass)` syntax. Why is it being used for `TextCrashLogParser` but not `JSONCrashLogParser`? Also, `__new__` is a static method, could it be `object.__new__(...)`? Or is there a subtly that requires an `object` instance? Somewhat related, would it be better to say `super().__new__(...)`?
> > > > > > > > > 
> > > > > > > > > Also: one class construction explicitly forwards the arguments, the other does not. Is there a reason both aren't implicit (or both explicit)?
> > > > > > > > As you know, python class are implicitly derived from the `object` type, making `object.__new__` and `super().__new__` pretty much the same thing.
> > > > > > > > 
> > > > > > > > In this specific case, both the `TextCrashLogParser` and `JSONCrashLogParser` inherits from the `CrashLogParser` class, so `JSONCrashLogParser` will just inherits `CrashLogParser.__new__` implementation if we don't override it, which creates a recursive loop.
> > > > > > > > That's why I'm calling the `__new__` method specifying the class.
> > > > > > > What's the advantage of this over this compared to a factory method? Seems like this could be:
> > > > > > > 
> > > > > > > ```
> > > > > > > def create(debugger, path, verbose)
> > > > > > >     try:
> > > > > > >             return JSONCrashLogParser(debugger, path, verbose)
> > > > > > >         except CrashLogFormatException:
> > > > > > >             return  TextCrashLogParser(debugger, path, verbose)
> > > > > > > ```
> > > > > > If we make a factory, then users could still call `__init__` on `CrashLogParser` and create a bogus object. With this approach, they're forced to instantiate a CrashLogParser like any another object.
> > > > > `CrashLogParser.__init__` could raise an exception. With intricacy of this approach, maybe it's better to use a factor method combined with an exception if the base class `__init__` is called.
> > > > +1, or maybe `abc` provide a capability to achieve the same?
> > > IMHO, having to call an arbitrary-called method (`create/make/...`) to instantiate an object and having the `__init__` raise an exception introduces more intricacies in the usage of this class, compared to what I'm doing. 
> > > 
> > > I prefer to keep it this way since it's more natural / safe to use. If the implementation exercises some python internal  features, that's fine because that shouldn't matter to the user.
> > Only after discussing it with you, and reading python docs, do I understand why this code is the way it is. Future editors, including us, could forget some details, which isn't great for maintainability.
> > 
> > You mention the user, are there external users of this class hierarchy? Or are these classes internal to crashlog.py? If the latter, then the simplified interface seems hypothetical. If there are external users, how many are they? I am trying to get a sense for what is gained by the avoiding a factory method.
> here's an idea that may simplify things:
> 
> instead of embedding validation inside `JSONCrashLogParser.__new__`, have a static/class method for validation. Then, the base class can decide which subclass by calling the validation method. This means the subclasses don't need to override `__new__`. Ex:
> 
> ```
> class Base:
>   def __new__(cls, i: int):
>     if Sub1.is_valid(i):
>       return object.__new__(Sub1)
>     else:
>       return object.__new__(Sub2)
> 
>   def __init__(self, i: int):
>     print("Base", i)
> 
> class Sub1(Base):
>   @staticmethod
>   def is_valid(i: int):
>     return i == 1234
> 
>   def __init__(self, i: int):
>     super().__init__(i)
>     print("Sub1", i)
> 
> class Sub2(Base):
>   def __init__(self, i: int):
>     super().__init__(i)
>     print("Sub2", i)
> ```
@JDevlieghere might be able to give more details on our users but `crashlog.py` could be imported in other python script, and from what he told me, we have some internal users that rely on it.

I don't have any strong opinion wrt your suggestion, I'll try it out and update the patch if that works well. Thanks @kastiglione !

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131085/new/

https://reviews.llvm.org/D131085